Multimodal Dense Stereo Matching

  • Max Mehltretter
  • Sebastian P. Kleinschmidt
  • Bernardo Wagner
  • Christian Heipke
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11269)


In this paper, we propose a new approach for dense depth estimation from multimodal stereo images. The approach employs a combined cost function that uses robust metrics together with a transformation to an illumination-independent representation. In addition, we present a confidence-based weighting scheme that adjusts the weights within the cost function pixel by pixel. We demonstrate the approach on RGB and thermal images and evaluate the resulting depth maps against depth measurements from a Velodyne HDL-64E LiDAR sensor. The results show that our method outperforms current state-of-the-art dense matching methods for depth estimation from multimodal input images.
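The abstract does not state the cost function or the weighting formula, so the following is only an illustrative sketch of how a pixel-wise, confidence-weighted combination of two matching-cost volumes could look. All names (`combine_costs`, `winner_take_all`, `conf_a`, `conf_b`) are hypothetical, turning the confidences into convex weights is an assumption, and the winner-take-all disparity selection is a simple stand-in rather than the optimisation used in the paper.

```python
import numpy as np


def combine_costs(cost_a, cost_b, conf_a, conf_b, eps=1e-6):
    """Blend two cost volumes with per-pixel confidence weights.

    cost_a, cost_b: arrays of shape (H, W, D), one cost per pixel and disparity.
    conf_a, conf_b: arrays of shape (H, W), non-negative confidences.
    Assumption: the weights are the confidences normalised to sum to 1.
    """
    # eps keeps the division defined and yields equal weights when both
    # confidences are zero.
    w_a = (conf_a + eps) / (conf_a + conf_b + 2.0 * eps)
    w_b = 1.0 - w_a
    # Broadcast the (H, W) weights over the disparity axis.
    return w_a[..., None] * cost_a + w_b[..., None] * cost_b


def winner_take_all(cost):
    """Pick, for every pixel, the disparity index with the lowest cost."""
    return np.argmin(cost, axis=-1)


if __name__ == "__main__":
    # Toy example: a 1 x 2 image with 3 disparity candidates.
    cost_a = np.ones((1, 2, 3))
    cost_a[..., 1] = 0.0          # first cost term prefers disparity 1
    cost_b = np.ones((1, 2, 3))
    cost_b[..., 2] = 0.0          # second cost term prefers disparity 2
    conf_a = np.array([[1.0, 0.0]])  # first term trusted only at pixel 0
    conf_b = np.array([[0.0, 1.0]])  # second term trusted only at pixel 1
    print(winner_take_all(combine_costs(cost_a, cost_b, conf_a, conf_b)))
```

In this toy setup each pixel follows whichever cost term has the higher confidence, which is the behaviour a pixel-wise weight adjustment is meant to provide.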



This work was supported by the German Research Foundation (DFG) as a part of the Research Training Group i.c.sens [GRK2159] and the MOBILISE initiative of the Leibniz Universität Hannover and TU Braunschweig.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Max Mehltretter (1)
  • Sebastian P. Kleinschmidt (2)
  • Bernardo Wagner (2)
  • Christian Heipke (1)

  1. Institute of Photogrammetry and GeoInformation, Leibniz Universität Hannover, Hanover, Germany
  2. Institute of Systems Engineering - Real Time Systems Group, Leibniz Universität Hannover, Hanover, Germany
