GPLFR—Global perspective and local flow registration-for forward-looking sonar images

  • Original Article
  • Neural Computing and Applications

Abstract

Forward-looking sonar (FLS) image registration is a key step in many underwater applications, such as underwater target detection, ocean observation, and mapping. However, the low resolution and low signal-to-noise ratio of FLS images, together with the complex nonlinear transformation between images taken from two different viewpoints, make registration challenging. To address these challenges, we propose a global perspective and local flow registration (GPLFR) method for FLS images. GPLFR consists of two networks: a regression correction network (RCNet) and a deformable network with iterative refinement of the residual (IRRDNet). For a given pair of FLS images, RCNet first estimates the global transformation parameters to achieve global registration, and IRRDNet then estimates the deformation (flow) field to achieve local alignment. Experimental results on real FLS image and 2D facial expression image registration tasks demonstrate the effectiveness and robustness of the proposed method.
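The coarse-to-fine idea of a global transformation followed by a dense flow field can be illustrated with a minimal NumPy sketch. It does not reproduce RCNet or IRRDNet: the matrix `M` and the array `flow` are hand-set stand-ins for what such networks might predict, and the helper names `warp_nearest` and `global_coords` are hypothetical.

```python
import numpy as np

def warp_nearest(image, coords):
    """Sample image at (row, col) coords with nearest-neighbour lookup; zeros outside."""
    rows = np.rint(coords[0]).astype(int)
    cols = np.rint(coords[1]).astype(int)
    h, w = image.shape
    valid = (rows >= 0) & (rows < h) & (cols >= 0) & (cols < w)
    out = np.zeros_like(image)
    out[valid] = image[rows[valid], cols[valid]]
    return out

def global_coords(shape, M):
    """Backward-map every output pixel through a 3x3 matrix M acting on (col, row, 1)."""
    h, w = shape
    rr, cc = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    pts = np.stack([cc.ravel(), rr.ravel(), np.ones(h * w)])
    mapped = M @ pts
    mapped = mapped[:2] / mapped[2]
    return np.stack([mapped[1].reshape(h, w), mapped[0].reshape(h, w)])

# Toy "moving" image with a single bright pixel at (row 2, col 3).
moving = np.zeros((8, 8))
moving[2, 3] = 1.0

# Stage 1 (assumed global parameters): sample one column to the right.
M = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
globally_aligned = warp_nearest(moving, global_coords(moving.shape, M))

# Stage 2 (assumed flow field): residual sampling offset of one row everywhere.
flow = np.zeros((2, 8, 8))
flow[0] += 1.0
identity = np.stack(np.meshgrid(np.arange(8), np.arange(8), indexing="ij")).astype(float)
locally_aligned = warp_nearest(globally_aligned, identity + flow)
```

Both stages use backward warping, so the bright pixel moves in the direction opposite to the sampling offset: first to column 2, then to row 1.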

References

  1. Zitova B, Flusser J (2003) Image registration methods: a survey. Image Vis Comput 21(11):977–1000

  2. Liu J, Gong J, Guo B, Zhang W (2017) A novel adjustment model for mosaicking low-overlap sweeping images. IEEE Trans Geosci Remot Sens 55(7):4089–4097

  3. Goshtasby AA, Nikolov S (2007) Image fusion: advances in the state of the art. Inf Fusion 8(2):114–118

  4. Zanetti M, Bruzzone L (2017) A theoretical framework for change detection based on a compound multiclass statistical model of the difference image. IEEE Trans Geosci Remot Sens 56(2):1129–1143

  5. Vakalopoulou M, Karantzalos K, Komodakis N, Paragios N (2015) Simultaneous registration and change detection in multitemporal, very high resolution remote sensing data. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pp 61–69

  6. Negahdaripour S, Firoozfam P, Sabzmeydani P (2005) On processing and registration of forward-scan acoustic video imagery. In: The 2nd Canadian conference on computer and robot vision (CRV’05), IEEE, pp 452–459

  7. Li H, Dong Y, He X, Xie S, Luo J (2014) A sonar image mosaicing algorithm based on improved sift for usv. In: 2014 IEEE International conference on mechatronics and automation, IEEE, pp 1839–1843

  8. Negahdaripour S, Aykin M, Sinnarajah S (2011) Dynamic scene analysis and mosaicing of benthic habitats by fs sonar imaging-issues and complexities. In: OCEANS’11 MTS/IEEE KONA, IEEE, pp 1–7

  9. Yang Z, Dan T, Yang Y (2018) Multi-temporal remote sensing image registration using deep convolutional features. IEEE Access 6:38544–38555

  10. Balakrishnan G, Zhao A, Sabuncu MR, Guttag J, Dalca AV (2019) Voxelmorph: a learning framework for deformable medical image registration. IEEE Trans Medical Imag 38(8):1788–1800

  11. Zhao S, Dong Y, Chang EI, Xu Y, et al. (2019) Recursive cascaded networks for unsupervised medical image registration. In: Proceedings of the IEEE international conference on computer vision, pp 10600–10610

  12. de Vos BD, Berendsen FF, Viergever MA, Staring M, Išgum I (2017) End-to-end unsupervised deformable image registration with a convolutional neural network. In: Deep learning in medical image analysis and multimodal learning for clinical decision support, Springer, pp 204–212

  13. Galceran E, Djapic V, Carreras M, Williams DP (2012) A real-time underwater object detection algorithm for multi-beam forward looking sonar. IFAC Proceed Vol 45(5):306–311

  14. Quidu I, Jaulin L, Bertholom A, Dupas Y (2012) Robust multitarget tracking in forward-looking sonar image sequences using navigational data. IEEE J Ocean Eng 37(3):417–430

  15. Clark DE, Bell J (2005) Bayesian multiple target tracking in forward scan sonar images using the phd filter. IEE Proceed-Radar, Sonar Navigat 152(5):327–334

  16. Petillot Y, Ruiz IT, Lane DM (2001) Underwater vehicle obstacle avoidance and path planning using a multi-beam forward looking sonar. IEEE J Ocean Eng 26(2):240–251

  17. Hurtos N, Ribas D, Cufí X, Petillot Y, Salvi J (2015) Fourier-based registration for robust forward-looking sonar mosaicing in low-visibility underwater environments. J Field Robot 32(1):123–151

  18. Hurtós N, Nagappa S, Cufí X, Petillot Y, Salvi J (2013) Evaluation of registration methods on two-dimensional forward-looking sonar imagery. In: 2013 MTS/IEEE OCEANS-Bergen, IEEE, pp 1–8

  19. Hurtós N, Petillot Y, Salvi J, et al. (2012) Fourier-based registrations for two-dimensional forward-looking sonar image mosaicing. In: 2012 IEEE/RSJ International conference on intelligent robots and systems, IEEE, pp 5298–5305

  20. Zhang J, Sohel F, Bian H, Bennamoun M, An S (2016) Forward-looking sonar image registration using polar transform. In: OCEANS 2016 MTS/IEEE Monterey, IEEE, pp 1–6

  21. Aykin M, Negahdaripour S (2012) On feature extraction and region matching for forward scan sonar imaging. In: 2012 Oceans, IEEE, pp 1–9

  22. Sekkati H, Negahdaripour S (2007) 3-d motion estimation for positioning from 2-d acoustic video imagery. In: Iberian conference on Pattern Recognition and Image Analysis, Springer, pp 80–88

  23. Hurtós Vilarnau N, et al. (2014) Forward-looking sonar mosaicing for underwater environments

  24. Hurtós N, Palomeras N, Nagappa S, Salvi J (2013) Automatic detection of underwater chain links using a forward-looking sonar. In: 2013 MTS/IEEE OCEANS-Bergen, IEEE, pp 1–7

  25. Guo Y, Wei L, Xu X (2020) A sonar image segmentation algorithm based on quantum-inspired particle swarm optimization and fuzzy clustering. Neural Comput Appl 32(22):16775–16782

  26. Zhao S, Lau T, Luo J, Eric I, Chang C, Xu Y (2019) Unsupervised 3d end-to-end medical image registration with volume tweening network. IEEE J Biomed Health Infor 24(5):1394–1404

  27. Hur J, Roth S (2019) Iterative residual refinement for joint optical flow and occlusion estimation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5754–5763

  28. Brown LG (1992) A survey of image registration techniques. ACM Comput Surveys (CSUR) 24(4):325–376

  29. Lowe DG (1999) Object recognition from local scale-invariant features. Proceedings of the seventh IEEE International conference on computer vision, IEEE 2:1150–1157

  30. Bay H, Tuytelaars T, Van Gool L (2006) Surf: Speeded up robust features. In: European conference on computer vision, Springer, pp 404–417

  31. Rublee E, Rabaud V, Konolige K, Bradski G (2011) Orb: An efficient alternative to sift or surf. In: 2011 International conference on computer vision, IEEE, pp 2564–2571

  32. Fischler MA, Bolles RC (1981) Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications ACM 24(6):381–395

  33. Moisan L, Moulon P, Monasse P (2012) Automatic homographic registration of a pair of images, with a contrario elimination of outliers. Image Process Line 2:56–73

  34. Raguram R, Chum O, Pollefeys M, Matas J, Frahm JM (2012) Usac: a universal framework for random sample consensus. IEEE Trans Patt Anal Mach Intell 35(8):2022–2038

  35. Tao W, Zhao J, Liu J, Zhang H (2010) Study on the side-scan sonar image matching navigation based on surf. In: 2010 International conference on electrical and control engineering, IEEE, pp 2181–2184

  36. Gai S, Xu X, Xiong B (2020) Paper currency defect detection algorithm using quaternion uniform strength. Neural Comput Appl, pp 1–18

  37. Viola P, Wells WM III (1997) Alignment by maximization of mutual information. International J Comput Vis 24(2):137–154

  38. Maes F, Collignon A, Vandermeulen D, Marchal G, Suetens P (1997) Multimodality image registration by maximization of mutual information. IEEE Transact Med Imag 16(2):187–198

  39. Wang G, Xu X, Jiang X, Ding S (2016) Medical image registration based on self-adapting pulse-coupled neural networks and mutual information. Neur Comput Appl 27(7):1917–1926

  40. Briechle K, Hanebeck UD (2001) Template matching using fast normalized cross correlation. Optical Pattern Recognition XII. Int Soci Optic Phot 4387:95–102

  41. Sarvaiya JN, Patnaik S, Bombaywala S (2009) Image registration by template matching using normalized cross-correlation. In: 2009 International conference on advances in computing, control, and telecommunication technologies, IEEE, pp 819–822

  42. Das A, Bhattacharya M (2011) Affine-based registration of CT and MR modality images of human brain using multiresolution approaches: comparative study on genetic algorithm and particle swarm optimization. Neural Comput Appl 20(2):223–237

  43. Song S, Herrmann JM, Si B, Liu K, Feng X (2017) Two-dimensional forward-looking sonar image registration by maximization of peripheral mutual information. Int J Adv Robot Sys 14(6):1729881417746270

  44. Valdenegro-Toro M (2017) Improving sonar image patch matching via deep learning. In: 2017 European conference on mobile robots (ECMR), IEEE, pp 1–6

  45. Sarnel H, Senol Y (2011) Accurate and robust image registration based on radial basis neural networks. Neural Comput Appl 20(8):1255–1262

  46. Ot P, dos Santos MM, Drews PLJ, da Costa Botelho SS, et al. (2017) Forward looking sonar scene matching using deep learning. In: 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA), IEEE, pp 574–579

  47. Cheng X, Zhang L, Zheng Y (2018) Deep similarity learning for multimodal medical images. Comput Method Biomech Biomed Eng: Imag Visual 6(3):248–252

  48. DeTone D, Malisiewicz T, Rabinovich A (2016) Deep image homography estimation. arXiv:1606.03798

  49. Chee E, Wu Z (2018) Airnet: Self-supervised affine registration for 3d medical images using neural networks. arXiv:1810.02583

  50. Sokooti H, De Vos B, Berendsen F, Lelieveldt BP, Išgum I, Staring M (2017) Nonrigid image registration using multi-scale 3d convolutional neural networks. In: International conference on medical image computing and computer-assisted intervention, Springer, pp 232–239

  51. Sokooti H, de Vos B, Berendsen F, Ghafoorian M, Yousefi S, Lelieveldt BP, Isgum I, Staring M (2019) 3d convolutional neural networks image registration based on efficient supervised learning from artificial deformations. arXiv:1908.10235

  52. Fu Y, Lei Y, Wang T, Curran WJ, Liu T, Yang X (2020) Deep learning in medical image registration: a review. Phys Medic Biol 65:20TR01

  53. Jaderberg M, Simonyan K, Zisserman A, et al. (2015) Spatial transformer networks. In: Advances in neural information processing systems, pp 2017–2025

  54. Ronneberger O, Fischer P, Brox T (2015) U-net: Convolutional networks for biomedical image segmentation. In: International conference on medical image computing and computer-assisted intervention, Springer, pp 234–241

  55. Zou W, Luo Y, Cao W, He Z, He Z (2021) A cascaded registration network rcinet with segmentation mask. Neural Comput Appl, pp 1–17

  56. Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556

  57. Von Gioi RG, Jakubowicz J, Morel JM, Randall G (2012) Lsd: a line segment detector. Image Process Line 2:35–55

  58. Liu R, Lehman J, Molino P, Petroski Such F, Frank E, Sergeev A, Yosinski J (2018) An intriguing failing of convolutional neural networks and the coordconv solution. Adv Neural Infor Process Sys 31:9605–9616

  59. Handa A, Bloesch M, Pătrăucean V, Stent S, McCormac J, Davison A (2016) gvnn: Neural network library for geometric computer vision. In: European conference on computer vision, Springer, pp 67–82

  60. Cheng H, Gupta K (1989) An historical note on finite rotations. J Appl Mech 56(1):139–145

  61. Gallego G, Yezzi A (2015) A compact formula for the derivative of a 3-d rotation in exponential coordinates. J Math Imag Vis 51(3):378–384

  62. Langner O, Dotsch R, Bijlstra G, Wigboldus DH, Hawk ST, Van Knippenberg A (2010) Presentation and validation of the radboud faces database. Cognit Emot 24(8):1377–1388

  63. Kingma DP, Ba J (2014) Adam: A method for stochastic optimization. arXiv:1412.6980

  64. Chen J, Li Y, Du Y, Frey EC (2020) Generating anthropomorphic phantoms using fully unsupervised deformable image registration with convolutional neural networks. Med Phys 47(12):6366–6380

  65. Saad ZS, Glen DR, Chen G, Beauchamp MS, Desai R, Cox RW (2009) A new method for improving functional-to-structural MRI alignment using local pearson correlation. Neuroimage 44(3):839–848

  66. Wang Z, Bovik AC, Sheikh HR, Simoncelli EP (2004) Image quality assessment: from error visibility to structural similarity. IEEE Trans Image process 13(4):600–612

  67. Guo X, Xu Z, Lu Y, Pang Y (2005) An application of fourier-mellin transform in image registration. In: The Fifth international conference on computer and information technology (CIT’05), IEEE, pp 619–623

  68. Chen X, Meng Y, Zhao Y, Williams R, Vallabhaneni SR, Zheng Y (2021) Learning unsupervised parameter-specific affine transformation for medical images registration. In: International conference on medical image computing and computer-assisted intervention, Springer, pp 24–34

  69. Mok TC, Chung AC (2020) Large deformation diffeomorphic image registration with laplacian pyramid networks. In: International conference on medical image computing and computer-assisted intervention, Springer, pp 211–221

  70. Kim B, Han I, Ye JC (2021) Diffusemorph: Unsupervised deformable image registration along continuous trajectory using diffusion models. arXiv:2112.05149

Author information

Corresponding author

Correspondence to Chunsheng Guo.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

In the sonar-based spherical coordinate system, the projection model of the sonar is shown in Fig. 9.

Fig. 9 Sonar coordinate system and projection model for mapping a 3D point onto the zero-elevation plane

The FLS projects the 3D point \({\mathbf {P}}\) in the scene onto the 2D image plane, and the projection point is \({\mathbf {P}}_{\mathrm {s}}\). The point \({\mathbf {P}}\) is defined as:

$$\begin{aligned}&{\mathbf {P}}=\left( \begin{array}{l} x \\ y \\ z \end{array}\right) =\left( \begin{array}{c} R \cos \varphi \sin \theta \\ R \cos \varphi \cos \theta \\ R \sin \varphi \end{array}\right) , \end{aligned}$$
(18)
$$\begin{aligned}&\left[ \begin{array}{l} R \\ \theta \\ \varphi \end{array}\right] =\left[ \begin{array}{c} \sqrt{x^{2}+y^{2}+z^{2}} \\ \tan ^{-1}(x / y) \\ \tan ^{-1}\left( z / \sqrt{x^{2}+y^{2}}\right) \end{array}\right] , \end{aligned}$$
(19)

where \(\left[ \begin{array}{lll}x&y&z\end{array}\right] ^{T}\) are the Cartesian coordinates of point \({\mathbf {P}}\), and \(\left[ \begin{array}{lll}R&\theta&\varphi \end{array}\right] ^{T}\) are the range, azimuth, and elevation angle of point \({\mathbf {P}}\) in the spherical coordinate system.
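As a quick numerical check of Eqs. 18 and 19, the following sketch converts a point between the two coordinate systems and back. The function names and the sample values are illustrative assumptions; `np.arctan2` is used in place of \(\tan ^{-1}\) so that the quadrant is resolved, which matches Eq. 19 whenever \(y>0\).

```python
import numpy as np

def spherical_to_cartesian(R, theta, phi):
    """Eq. 18: (range, azimuth, elevation) -> (x, y, z)."""
    return np.array([
        R * np.cos(phi) * np.sin(theta),
        R * np.cos(phi) * np.cos(theta),
        R * np.sin(phi),
    ])

def cartesian_to_spherical(p):
    """Eq. 19: (x, y, z) -> (range, azimuth, elevation)."""
    x, y, z = p
    R = np.sqrt(x**2 + y**2 + z**2)
    theta = np.arctan2(x, y)             # tan^-1(x / y), quadrant-aware
    phi = np.arctan2(z, np.hypot(x, y))  # tan^-1(z / sqrt(x^2 + y^2))
    return R, theta, phi

# Assumed sample point: 5 m range, 20 deg azimuth, -8 deg elevation.
P = spherical_to_cartesian(5.0, np.deg2rad(20.0), np.deg2rad(-8.0))
R, theta, phi = cartesian_to_spherical(P)
```

The round trip recovers the original range and angles up to floating-point error.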

The projection point \({\mathbf {P}}_{\mathrm {s}}\) is defined as:

$$\begin{aligned} {\mathbf {P}}_{\mathrm {s}}=\left( \begin{array}{l} x_{s} \\ y_{s} \end{array}\right) =\left[ \begin{array}{l} R \sin \theta \\ R \cos \theta \end{array}\right] =\frac{1}{\cos \varphi }\left[ \begin{array}{l} x \\ y \end{array}\right] . \end{aligned}$$
(20)
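The two forms of Eq. 20 can be compared numerically. The sketch below, with a hypothetical helper `project_to_image` and an arbitrary sample point, computes the image point from range and azimuth and checks it against \([x, y]^{T}/\cos \varphi \).

```python
import numpy as np

def project_to_image(p):
    """Eq. 20: map a 3D point (x, y, z) to the image point (x_s, y_s)."""
    x, y, z = p
    R = np.sqrt(x**2 + y**2 + z**2)
    theta = np.arctan2(x, y)
    return np.array([R * np.sin(theta), R * np.cos(theta)])

# Assumed sample point in front of the sonar (y > 0).
p = np.array([1.0, 4.0, -0.5])
ps = project_to_image(p)

# cos(phi) follows from Eq. 19: horizontal distance divided by range.
cos_phi = np.hypot(p[0], p[1]) / np.linalg.norm(p)
```

The projection keeps the range and azimuth of the point and discards its elevation, so the image point lies at distance \(R\) from the origin.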

Now suppose that the sonar undergoes rigid-body motion. The coordinates \({\mathbf {P}}\) and \({\mathbf {P}}^{\prime }\) of the same scene point in the two views then satisfy the following transformation:

$$\begin{aligned} {\mathbf {P}}^{\prime }={\mathbf {R}} {\mathbf {P}}+{\mathbf {t}}, \end{aligned}$$
(21)

where \({\mathbf {R}}\) is a \(3 \times 3\) rotation matrix and \({\mathbf {t}}\) is a 3D translation vector.

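Let \({\mathbf {n}}=\left[ n_{x}, n_{y}, n_{z}\right] ^{T}\) be the normal vector derived from the plane equation \(Z=Z_{o}+\zeta _{x} X+\zeta _{y} Y\), scaled so that \({\mathbf {n}} \cdot {\mathbf {P}}=1\) for every point \({\mathbf {P}}\) on the plane. Since \({\mathbf {t}}={\mathbf {t}} {\mathbf {n}}^{T} {\mathbf {P}}\) for such points, Eq. 21 becomes [22]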

$$\begin{aligned} {\mathbf {P}}^{\prime }=\left( {\mathbf {R}}+\mathbf {t n}^{T}\right) {\mathbf {P}}={\mathbf {Q}} {\mathbf {P}}, \end{aligned}$$
(22)

where

$$\begin{aligned} {\mathbf {Q}}=\left[ \begin{array}{lll} q_{11} &{} q_{12} &{} q_{13} \\ q_{21} &{} q_{22} &{} q_{23} \\ q_{31} &{} q_{32} &{} q_{33} \end{array}\right] . \end{aligned}$$
(23)
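A small numerical sketch can confirm that Eq. 22 agrees with the rigid motion of Eq. 21 for points on the plane. The plane coefficients, rotation, and translation below are arbitrary assumptions, and the scaled normal is taken as \({\mathbf {n}}=[-\zeta _{x}, -\zeta _{y}, 1]^{T}/Z_{o}\), which follows from rearranging the plane equation.

```python
import numpy as np

def rotation_z(angle):
    """Rotation about the z axis, used here only as an example rotation."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Assumed plane Z = Z0 + zx * X + zy * Y (2 m below the sonar, slightly tilted).
Z0, zx, zy = -2.0, 0.1, -0.05
n = np.array([-zx, -zy, 1.0]) / Z0  # scaled normal, so that n . P = 1 on the plane

def point_on_plane(X, Y):
    return np.array([X, Y, Z0 + zx * X + zy * Y])

# Assumed rigid motion between the two views.
Rmat = rotation_z(np.deg2rad(5.0))
t = np.array([0.3, 0.1, 0.0])
Q = Rmat + np.outer(t, n)           # Eq. 22: Q = R + t n^T

P = point_on_plane(1.5, 4.0)
P_rigid = Rmat @ P + t              # Eq. 21
P_planar = Q @ P                    # Eq. 22
```

The two results coincide only because the point lies on the plane; for a point off the plane, \({\mathbf {n}} \cdot {\mathbf {P}} \ne 1\) and the single matrix \({\mathbf {Q}}\) no longer reproduces the rigid motion.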

Using Eq. 18, we get

$$\begin{aligned} \left[ \begin{array}{c} R^{\prime } \cos \varphi ^{\prime } \sin \theta ^{\prime } \\ R^{\prime } \cos \varphi ^{\prime } \cos \theta ^{\prime } \\ R^{\prime } \sin \varphi ^{\prime } \end{array}\right] ={\mathbf {Q}}\left[ \begin{array}{c} R \cos \varphi \sin \theta \\ R \cos \varphi \cos \theta \\ R \sin \varphi \end{array}\right] , \end{aligned}$$
(24)

Factoring \(\cos \varphi \) out of the right-hand vector and dividing both sides by \(\cos \varphi ^{\prime }\) gives:

$$\begin{aligned} \left[ \begin{array}{l} R^{\prime } \sin \theta ^{\prime } \\ R^{\prime } \cos \theta ^{\prime } \\ R^{\prime } \tan \varphi ^{\prime } \end{array}\right] =\left( \frac{\cos \varphi }{\cos \varphi ^{\prime }}\right) {\mathbf {Q}}\left[ \begin{array}{l} R \sin \theta \\ R \cos \theta \\ R \tan \varphi \end{array}\right] . \end{aligned}$$
(25)

Using Eq. 20, we get

$$\begin{aligned} \left[ \begin{array}{c} x_{s}^{\prime } \\ y_{s}^{\prime } \\ R^{\prime } \tan \varphi ^{\prime } \end{array}\right] =\left( \frac{\cos \varphi }{\cos \varphi ^{\prime }}\right) {\mathbf {Q}}\left[ \begin{array}{c} x_{s} \\ y_{s} \\ R \tan \varphi \end{array}\right] , \end{aligned}$$
(26)

and then

$$\begin{aligned} \begin{array}{l} x_{s}^{\prime }=\left( \frac{\cos \varphi }{\cos \varphi ^{\prime }}\right) \left[ q_{11} x_{s}+q_{12} y_{s}+q_{13} R \tan \varphi \right] , \\ y_{s}^{\prime }=\left( \frac{\cos \varphi }{\cos \varphi ^{\prime }}\right) \left[ q_{21} x_{s}+q_{22} y_{s}+q_{23} R \tan \varphi \right] , \end{array} \end{aligned}$$
(27)

where the projection of point \({\mathbf {P}}^{\prime }\) onto the 2D image plane is:

$$\begin{aligned} {\mathbf {P}}_{s}^{\prime }=\left( \begin{array}{c} x_{s}^{\prime } \\ y_{s}^{\prime } \end{array}\right) =\left[ \begin{array}{l} R^{\prime } \sin \theta ^{\prime } \\ R^{\prime } \cos \theta ^{\prime } \end{array}\right] =\frac{1}{\cos \varphi ^{\prime }}\left[ \begin{array}{l} x^{\prime } \\ y^{\prime } \end{array}\right] . \end{aligned}$$
(28)

Finally, sonar image points satisfy the transformation [6]:

$$\begin{aligned} \left[ \begin{array}{c} x_{s}^{\prime } \\ y_{s}^{\prime } \\ 1 \end{array}\right] ={\mathbf {H}}\left[ \begin{array}{c} x_{s} \\ y_{s} \\ 1 \end{array}\right] , \end{aligned}$$
(29)

where

$$\begin{aligned} {\mathbf {H}}=\left[ \begin{array}{ccc} \alpha q_{11} &{} \alpha q_{12} &{} \beta q_{13} \\ \alpha q_{21} &{} \alpha q_{22} &{} \beta q_{23} \\ 0 &{} 0 &{} 1 \end{array}\right] ; \alpha =\frac{\cos \varphi }{\cos \varphi ^{\prime }}, \beta =R \frac{\sin \varphi }{\cos \varphi ^{\prime }}. \end{aligned}$$
(30)

Although \({\mathbf {H}}\) has the form of an affine model, its elements vary across the image because \(\alpha \) and \(\beta \) depend on the elevation angles \(\varphi \) and \(\varphi ^{\prime }\), which differ from point to point. As a result, sonar images from two different viewpoints are related by a complex nonlinear transformation [22].
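The point dependence of \({\mathbf {H}}\) can be seen in a short numerical sketch. The plane, motion, and sample points are arbitrary assumptions, and `project` and `point_homography` are hypothetical helper names. For each point, the sketch builds \({\mathbf {H}}\) from Eq. 30 and compares its prediction with the direct projection of \({\mathbf {Q}} {\mathbf {P}}\).

```python
import numpy as np

def project(p):
    """Eq. 20 in the form [x, y] / cos(phi)."""
    x, y, z = p
    cos_phi = np.hypot(x, y) / np.linalg.norm(p)
    return np.array([x, y]) / cos_phi

def point_homography(Q, p, p_prime):
    """Eq. 30: the matrix H for one scene point and its transformed position."""
    rng = np.linalg.norm(p)
    sin_phi = p[2] / rng
    cos_phi = np.hypot(p[0], p[1]) / rng
    cos_phi_prime = np.hypot(p_prime[0], p_prime[1]) / np.linalg.norm(p_prime)
    alpha = cos_phi / cos_phi_prime
    beta = rng * sin_phi / cos_phi_prime
    return np.array([
        [alpha * Q[0, 0], alpha * Q[0, 1], beta * Q[0, 2]],
        [alpha * Q[1, 0], alpha * Q[1, 1], beta * Q[1, 2]],
        [0.0, 0.0, 1.0],
    ])

# Assumed plane Z = Z0 + zx * X + zy * Y and rigid motion (R, t).
Z0, zx, zy = -2.0, 0.1, -0.05
n = np.array([-zx, -zy, 1.0]) / Z0
c, s = np.cos(np.deg2rad(5.0)), np.sin(np.deg2rad(5.0))
Rmat = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
t = np.array([0.3, 0.1, 0.2])
Q = Rmat + np.outer(t, n)  # Eq. 22

homographies, errors = [], []
for X, Y in [(1.5, 4.0), (-2.0, 8.0)]:
    P = np.array([X, Y, Z0 + zx * X + zy * Y])
    P_prime = Q @ P
    H = point_homography(Q, P, P_prime)
    predicted = H @ np.append(project(P), 1.0)  # Eq. 29
    errors.append(np.abs(predicted[:2] - project(P_prime)).max())
    homographies.append(H)
```

Each per-point \({\mathbf {H}}\) reproduces the projected position of its own point, yet the two matrices differ, so no single affine matrix maps the whole image.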

Cite this article

Huang, P., Guo, C., Fu, X. et al. GPLFR—Global perspective and local flow registration-for forward-looking sonar images. Neural Comput & Applic 34, 12663–12679 (2022). https://doi.org/10.1007/s00521-022-07113-8
