Abstract
This paper describes a feature-based matching method with very fast runtime performance. The speed comes from the simple quantised patches used for matching and from a tree-based lookup scheme that avoids exhaustively comparing each query patch against the entire feature database. The method localises seven independently moving targets in a test sequence in an average total processing time of 6.03 ms per frame.
A training phase is employed to identify the most repeatable features from a particular range of viewpoints and to learn a model for the patches corresponding to each feature. Feature models consist of independent histograms of quantised intensity for each pixel in the patch, which we refer to as Histogrammed Intensity Patches (HIPs). The histogram values are thresholded and the feature model is stored in a compact binary representation which requires under 60 bytes of memory per feature and permits the rapid computation of a matching score using bitwise operations.
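The bitwise matching score described above can be illustrated with a minimal sketch. It is not the authors' implementation: the patch size (64 pixels), the number of intensity bins (5), the rarity threshold `rare_fraction`, and all function names are assumptions chosen for illustration. Under these assumptions a model is five 64-bit masks (40 bytes), and the score counts the pixels of a query patch that fall in a bin the training data rarely produced at that position.

```python
from typing import List, Sequence

NUM_PIXELS = 64  # assumed patch size (e.g. 8x8); not stated in the abstract
NUM_BINS = 5     # assumed number of quantised intensity levels


def encode_query(bins: Sequence[int]) -> List[int]:
    """Build one bitmask per bin: bit i is set when pixel i falls in that bin."""
    masks = [0] * NUM_BINS
    for i, b in enumerate(bins):
        masks[b] |= 1 << i
    return masks


def train_model(samples: Sequence[Sequence[int]],
                rare_fraction: float = 0.05) -> List[int]:
    """Threshold per-pixel histograms into a binary model.

    Bit i of mask b is set when pixel i fell in bin b in fewer than
    rare_fraction of the training samples (hypothetical threshold).
    """
    counts = [[0] * NUM_PIXELS for _ in range(NUM_BINS)]
    for bins in samples:
        for i, b in enumerate(bins):
            counts[b][i] += 1
    limit = rare_fraction * len(samples)
    masks = [0] * NUM_BINS
    for b in range(NUM_BINS):
        for i in range(NUM_PIXELS):
            if counts[b][i] < limit:
                masks[b] |= 1 << i
    return masks


def match_error(model: Sequence[int], query: Sequence[int]) -> int:
    """Count query pixels that land in a bin the model marks as rare."""
    return sum(bin(m & q).count("1") for m, q in zip(model, query))


# Usage: train on identical patches, then score an exact and a perturbed query.
training = [[2] * NUM_PIXELS for _ in range(20)]
model = train_model(training)
exact = encode_query([2] * NUM_PIXELS)
perturbed = encode_query([0, 0, 0] + [2] * (NUM_PIXELS - 3))
```

In this sketch a lower error means a better match, and a feature would be accepted when the error falls below some chosen threshold. Each mask fits in a single 64-bit word, so the score reduces to a few AND and population-count operations per feature.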
The method achieves better matching robustness than the state-of-the-art fast localisation schemes introduced by Wagner et al. (IEEE International Symposium on Mixed and Augmented Reality, 2008). Additionally, both the runtime memory usage and the computation time are reduced by a factor of more than four.
References
Ballard, D. H. (1987). Generalizing the Hough transform to detect arbitrary shapes. Readings in Computer Vision: Issues, Problems, Principles, and Paradigms, 1, 714–725.
Bay, H., Ess, A., Tuytelaars, T., & Van Gool, L. (2008). SURF: speeded up robust features. Computer Vision and Image Understanding, 110(3), 346–359.
Brown, M., Szeliski, R., & Winder, S. (2005). Multi-image matching using multi-scale oriented patches. In IEEE computer society conference on computer vision and pattern recognition (pp. 510–517).
Chum, O., & Matas, J. (2005). Matching with PROSAC—progressive sample consensus. In IEEE computer society conference on computer vision and pattern recognition (pp. 220–226).
Croes, G. A. (1958). A method for solving traveling-salesman problems. Operations Research, 6(6), 791–812.
Fischler, M. A., & Bolles, R. C. (1981). Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6), 381–395.
Harris, C., & Stephens, M. (1988). A combined corner and edge detector. In 4th ALVEY vision conference (pp. 147–151).
Heikkilä, M., Pietikäinen, M., & Schmid, C. (2009). Description of interest regions with local binary patterns. Pattern Recognition, 42(3), 425–436.
Hinterstoisser, S., Benhimane, S., Lepetit, V., Fua, P., & Navab, N. (2008). Simultaneous recognition and homography extraction of local patches with a simple linear classifier. In British machine vision conference.
Lepetit, V., & Fua, P. (2006). Keypoint recognition using randomized trees. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(9), 1465–1479.
Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2), 91–110.
Matas, J., Chum, O., Urban, M., & Pajdla, T. (2004). Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10), 761–767.
McIlroy, P., Rosten, E., Taylor, S., & Drummond, T. (2010). Deterministic sample consensus with multiple match hypotheses. In British machine vision conference.
Mikolajczyk, K., & Schmid, C. (2001). Indexing based on scale invariant interest points. In IEEE international conference on computer vision (pp. 525–531).
Mikolajczyk, K., & Schmid, C. (2002). An affine invariant interest point detector. In European conference on computer vision (pp. 128–142).
Mikolajczyk, K., & Schmid, C. (2005). A performance evaluation of local descriptors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(10), 1615–1630.
Moravec, H. (1981). Rover visual obstacle avoidance. In International joint conference on artificial intelligence (pp. 785–790).
Morel, J. M., & Yu, G. (2009). ASIFT: a new framework for fully affine invariant image comparison. SIAM Journal on Imaging Sciences, 2(2), 438–469.
Ozuysal, M., Fua, P., & Lepetit, V. (2007). Fast keypoint recognition in ten lines of code. In IEEE computer society conference on computer vision and pattern recognition.
Rosten, E., & Drummond, T. (2006). Machine learning for high speed corner detection. In European conference on computer vision (pp. 430–443).
Schmid, C., & Mohr, R. (1997). Local greyvalue invariants for image retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19, 530–535.
Taylor, S., & Drummond, T. (2009). Multiple target localisation at over 100 FPS. In British machine vision conference.
Taylor, S., Rosten, E., & Drummond, T. (2009). Robust feature matching in 2.3 μs. In IEEE CVPR workshop on feature detectors and descriptors: the state of the art and beyond.
Wagner, D., Reitmayr, G., Mulloni, A., Drummond, T., & Schmalstieg, D. (2008). Pose tracking from natural features on mobile phones. In IEEE international symposium on mixed and augmented reality.
Winder, S. A., & Brown, M. (2007). Learning local image descriptors. In IEEE computer society conference on computer vision and pattern recognition.
Additional information
This work was supported by The Boeing Company.
About this article
Cite this article
Taylor, S., Drummond, T. Binary Histogrammed Intensity Patches for Efficient and Robust Matching. Int J Comput Vis 94, 241–265 (2011). https://doi.org/10.1007/s11263-011-0430-6