Multimodal Image Alignment Through a Multiscale Chain of Neural Networks with Application to Remote Sensing

  • Armand Zampieri
  • Guillaume Charpiat (email author)
  • Nicolas Girard
  • Yuliya Tarabalka
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11220)


We tackle the problem of non-rigid multimodal image registration, which is of prime importance in remote sensing and medical imaging. The difficulties encountered by classical registration approaches include feature design and slow optimization by gradient descent. Analyzing these methods, we note the significance of the notion of scale. We design easy-to-train, fully convolutional neural networks able to learn scale-specific features. Once chained appropriately, they perform global registration in linear time, replacing gradient descent schemes with direct prediction of the deformation. We demonstrate their quality and speed on various remote sensing multimodal image alignment tasks. In particular, we correctly register cadastral maps of buildings as well as road polylines onto RGB images, and outperform current keypoint matching methods.
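The coarse-to-fine chaining idea can be illustrated with a toy sketch. This is not the authors' implementation: the learned scale-specific networks are replaced by a brute-force search (the hypothetical `predict_shift`), and the dense non-rigid deformation is simplified to a single global translation. All function names and the scale factors are assumptions made for illustration. Each stage only has to explain a residual of about one pixel at its own scale, and the estimates are accumulated from coarse to fine.

```python
import numpy as np

def make_blob(size, cy, cx, sigma):
    """Synthetic test image: a smooth Gaussian blob on a size x size grid."""
    y, x = np.mgrid[0:size, 0:size]
    return np.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2.0 * sigma ** 2))

def downsample(img, factor):
    """Average-pool a 2D array by an integer factor (shape must be divisible)."""
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def predict_shift(fixed, moving, radius=1):
    """Stand-in for a scale-specific network (assumption, not the paper's model).

    Brute-force search for the integer shift within +/- radius that, applied
    to `moving` with np.roll, minimizes the mean squared error to `fixed`.
    """
    best, best_err = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            err = np.mean((shifted - fixed) ** 2)
            if err < best_err:
                best, best_err = (dy, dx), err
    return best

def register_multiscale(fixed, moving, factors=(8, 4, 2, 1)):
    """Chain per-scale predictions from coarse to fine.

    At each scale, `moving` is first warped by the estimate accumulated so
    far, so the current stage only sees the remaining residual.
    """
    total_dy, total_dx = 0, 0
    for f in factors:
        warped = np.roll(moving, (total_dy, total_dx), axis=(0, 1))
        dy, dx = predict_shift(downsample(fixed, f), downsample(warped, f))
        # A shift of one coarse pixel corresponds to f full-resolution pixels.
        total_dy += f * dy
        total_dx += f * dx
    return total_dy, total_dx

# Usage: misalign a blob by a known circular shift, then estimate the shift
# that brings it back onto the fixed image.
fixed = make_blob(64, 32, 32, 10.0)
moving = np.roll(fixed, (-5, 3), axis=(0, 1))
estimate = register_multiscale(fixed, moving)
print(estimate)
```

With four stages and a search radius of one, the chain can represent shifts of up to 8 + 4 + 2 + 1 = 15 pixels per axis while evaluating only nine candidates per stage. In the paper's setting each stage would instead be a trained network predicting a dense deformation field.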


Keywords: Multimodal, Alignment, Registration, Remote sensing



This work benefited from the support of the project EPITOME ANR-17-CE23-0009 of the French National Research Agency (ANR).

Supplementary material

Supplementary material 1: 474218_1_En_40_MOESM1_ESM.pdf (PDF, 11643 KB)



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Armand Zampieri (1)
  • Guillaume Charpiat (2), email author
  • Nicolas Girard (1)
  • Yuliya Tarabalka (1)

  1. TITANE team, INRIA, Université Côte d’Azur, Nice, France
  2. TAU team, INRIA, LRI, Université Paris-Sud, Orsay, France
