Abstract
Image matching is a fundamental task in photogrammetry and computer vision. Effective solutions exist for narrow-baseline viewing conditions, e.g., detectors based on differences of Gaussians (DoG) combined with descriptors such as the scale-invariant feature transform (SIFT), but matching remains challenging for wide-baseline configurations. This is particularly true when images from unmanned aerial vehicles (UAVs) have to be matched with images taken from the ground. In this paper, we propose a method for wide-baseline image matching that extends the current state-of-the-art approach, matching on demand with view synthesis (MODS), so that even more extreme wide-baseline problems can be solved. We achieve this (1) by using projective transformations during view synthesis to overcome limitations induced by the approximate character of affine transformations and (2) by estimating the essential matrix within geometric verification to filter incorrect correspondences more robustly when the camera calibration is known. We evaluated our approach on several challenging image pairs, mainly consisting of UAV-based images together with images taken from the ground, and demonstrate improved performance compared to MODS.
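To illustrate point (1), the following minimal sketch compares a projective warp with its affine approximation. It rests on assumptions that are not taken from the paper: the synthesized view is modelled as a pure camera rotation about the vertical axis, so the warp is the homography H = K R K^-1, and the affine map is the first-order linearization of H at the image centre. The function names and the camera parameters (`f`, `cx`, `cy`) are hypothetical and serve only this illustration.

```python
import math


def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def tilt_homography(f, cx, cy, angle_deg):
    """Homography H = K R K^-1 for a camera rotated about its y-axis.

    Assumption: square pixels, no skew, focal length f in pixels.
    """
    a = math.radians(angle_deg)
    K = [[f, 0.0, cx], [0.0, f, cy], [0.0, 0.0, 1.0]]
    K_inv = [[1.0 / f, 0.0, -cx / f], [0.0, 1.0 / f, -cy / f], [0.0, 0.0, 1.0]]
    R = [[math.cos(a), 0.0, math.sin(a)],
         [0.0, 1.0, 0.0],
         [-math.sin(a), 0.0, math.cos(a)]]
    return matmul(matmul(K, R), K_inv)


def apply_projective(H, x, y):
    """Map a pixel through H, including the perspective division."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return u / w, v / w


def affine_approximation(H, x0, y0, eps=1e-3):
    """Return the affine map that linearizes H around (x0, y0).

    The Jacobian is estimated with central finite differences.
    """
    u0, v0 = apply_projective(H, x0, y0)
    uxp, vxp = apply_projective(H, x0 + eps, y0)
    uxm, vxm = apply_projective(H, x0 - eps, y0)
    uyp, vyp = apply_projective(H, x0, y0 + eps)
    uym, vym = apply_projective(H, x0, y0 - eps)
    j = [[(uxp - uxm) / (2 * eps), (uyp - uym) / (2 * eps)],
         [(vxp - vxm) / (2 * eps), (vyp - vym) / (2 * eps)]]

    def affine(x, y):
        dx, dy = x - x0, y - y0
        return (u0 + j[0][0] * dx + j[0][1] * dy,
                v0 + j[1][0] * dx + j[1][1] * dy)

    return affine


if __name__ == "__main__":
    # Hypothetical 1000 x 1000 pixel image, 40 degree virtual rotation.
    H = tilt_homography(1000.0, 500.0, 500.0, 40.0)
    affine = affine_approximation(H, 500.0, 500.0)
    for x, y in [(500.0, 500.0), (750.0, 500.0), (1000.0, 500.0)]:
        pu, pv = apply_projective(H, x, y)
        au, av = affine(x, y)
        print((x, y), "deviation in pixels:", math.hypot(pu - au, pv - av))
```

In this toy setting the two maps agree at the linearization point and drift apart towards the image border, which is one way to picture the approximate character of affine view synthesis under strong viewpoint change.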
Summary
Wide-baseline image matching with projective view synthesis and calibrated geometric verification. Image matching is a fundamental task in photogrammetry and computer vision. While effective solutions exist for narrow-baseline acquisition conditions, using detectors based, e.g., on differences of Gaussians (DoG) and descriptors such as the scale-invariant feature transform (SIFT), the task remains a challenge for wide-baseline configurations. This holds in particular when UAV-based (unmanned aerial vehicle) images are combined with images taken from the ground. In this contribution, we propose a method for wide-baseline image matching that extends the current state-of-the-art approach matching on demand with view synthesis (MODS) so that even more extreme wide-baseline problems can be solved. We achieve this (1) by using projective transformations during view synthesis to overcome limitations caused by the approximate character of affine transformations and (2) by estimating the essential matrix within geometric verification to filter incorrect correspondences more robustly when the camera calibration is known. We evaluated our approach on several image pairs with extremely different viewing directions, most of which consist of one UAV-based image and one image taken from the ground, and demonstrate improved performance of our method compared to MODS.
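Point (2), calibrated geometric verification, can be pictured with the following minimal sketch. It does not estimate the essential matrix, which the paper does inside geometric verification. Instead it assumes a relative pose (R, t) is already given, builds E = [t]x R, and rejects correspondences whose Sampson distance to the epipolar geometry exceeds a threshold. The function names, the simple pinhole calibration (`f`, `cx`, `cy`) and the pixel threshold are hypothetical choices for illustration only.

```python
import math


def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(m, v):
    """Multiply a 3x3 matrix with a 3-vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def essential_from_pose(R, t):
    """E = [t]x R for the convention X2 = R X1 + t."""
    tx = [[0.0, -t[2], t[1]], [t[2], 0.0, -t[0]], [-t[1], t[0], 0.0]]
    return matmul(tx, R)


def project(X, f, cx, cy):
    """Pinhole projection of a 3D point in camera coordinates to pixels."""
    return (f * X[0] / X[2] + cx, f * X[1] / X[2] + cy)


def normalize(p, f, cx, cy):
    """Apply K^-1: pixel coordinates to normalized homogeneous coordinates."""
    return [(p[0] - cx) / f, (p[1] - cy) / f, 1.0]


def sampson_distance(E, x1, x2):
    """First-order approximation of the squared distance to x2^T E x1 = 0."""
    Ex1 = matvec(E, x1)
    Et = [[E[j][i] for j in range(3)] for i in range(3)]
    Etx2 = matvec(Et, x2)
    residual = sum(x2[i] * Ex1[i] for i in range(3))
    denom = Ex1[0] ** 2 + Ex1[1] ** 2 + Etx2[0] ** 2 + Etx2[1] ** 2
    return residual ** 2 / denom


def filter_matches(matches, E, f, cx, cy, threshold_px=2.0):
    """Keep pixel correspondences (p1, p2) consistent with E."""
    # Convert the pixel threshold to squared normalized image units.
    limit = (threshold_px / f) ** 2
    return [(p1, p2) for p1, p2 in matches
            if sampson_distance(E, normalize(p1, f, cx, cy),
                                normalize(p2, f, cx, cy)) < limit]


if __name__ == "__main__":
    f, cx, cy = 1000.0, 500.0, 500.0  # hypothetical calibration
    a = math.radians(30.0)
    R = [[math.cos(a), 0.0, math.sin(a)],
         [0.0, 1.0, 0.0],
         [-math.sin(a), 0.0, math.cos(a)]]
    t = [1.0, 0.0, 0.2]
    E = essential_from_pose(R, t)
    X1 = [0.5, 0.3, 5.0]
    X2 = [v + t[i] for i, v in enumerate(matvec(R, X1))]
    p1, p2 = project(X1, f, cx, cy), project(X2, f, cx, cy)
    matches = [(p1, p2), (p1, (p2[0], p2[1] + 50.0))]  # second one is wrong
    print(len(filter_matches(matches, E, f, cx, cy)), "of", len(matches), "kept")
```

Working in normalized coordinates is what makes this a calibrated test: the known calibration removes the intrinsic parameters from the epipolar constraint, so only the relative pose remains to be explained by the correspondences.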
Cite this article
Roth, L., Kuhn, A. & Mayer, H. Wide-Baseline Image Matching with Projective View Synthesis and Calibrated Geometric Verification. PFG 85, 85–95 (2017). https://doi.org/10.1007/s41064-017-0012-5
DOI: https://doi.org/10.1007/s41064-017-0012-5