Abstract
Traditional hand-crafted features for representing local image patches are evolving into data-driven, learning-based image features, but learning a robust and discriminative descriptor that can handle various patch-level computer vision tasks remains an open problem. In this work, we propose a novel deep convolutional neural network (CNN) to learn local feature descriptors. We use quadruplets of positive and negative training samples, together with a constraint that restricts the intra-class variance, to learn discriminative CNN representations. Compared with previous works, our model reduces the overlap in feature space between corresponding and non-corresponding patch pairs, and mitigates the margin-varying problem caused by the commonly used triplet loss. We demonstrate that our method achieves better embedding results than recent works such as PN-Net and TN-TG on a benchmark dataset.
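The abstract does not state the loss formula. Purely as an illustration, a generic quadruplet margin loss with an added intra-class penalty might look like the sketch below. The function names, the two margins, the penalty weight, and the exact form of each term are assumptions made for this example and should not be read as the authors' formulation.

```python
from typing import Sequence

Vector = Sequence[float]


def squared_distance(u: Vector, v: Vector) -> float:
    """Squared Euclidean distance between two descriptors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def quadruplet_loss(
    anchor: Vector,
    positive: Vector,
    negative1: Vector,
    negative2: Vector,
    margin1: float = 1.0,       # hypothetical margin for the anchor-relative term
    margin2: float = 0.5,       # hypothetical margin for the anchor-free term
    intra_weight: float = 0.1,  # hypothetical weight of the intra-class penalty
) -> float:
    """Illustrative quadruplet loss for one sample (assumed form, not the paper's)."""
    d_ap = squared_distance(anchor, positive)      # matching pair
    d_an = squared_distance(anchor, negative1)     # non-matching pair sharing the anchor
    d_nn = squared_distance(negative1, negative2)  # non-matching pair without the anchor

    # Triplet-style term: the matching pair should be closer than the
    # anchor-negative pair by at least margin1.
    ranking_term = max(0.0, d_ap - d_an + margin1)

    # Extra quadruplet term: the matching pair should also be closer than a
    # non-matching pair that does not involve the anchor.
    pair_term = max(0.0, d_ap - d_nn + margin2)

    # Assumed intra-class constraint: directly penalise the matching distance.
    intra_term = intra_weight * d_ap

    return ranking_term + pair_term + intra_term
```

The second hinge compares the matching pair against a non-matching pair that does not share the anchor, which is one common way quadruplet losses try to make the margin less dependent on the chosen anchor.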
References
N. Molton, A. J. Davison and I. Reid, Locally Planar Patch Features for Real-Time Structure from Motion, British Machine Vision Conference, 1 (2004).
S. M. Seitz, B. Curless, J. Diebel, D. Scharstein and R. Szeliski, A Comparison and Evaluation of Multi-View Stereo Reconstruction Algorithms, IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 519 (2006).
R. Szeliski, Foundations and Trends in Computer Graphics and Vision 2, 1 (2006).
D. G. Lowe, International Journal of Computer Vision 60, 91 (2004).
H. Bay, A. Ess, T. Tuytelaars and L. Van Gool, Computer Vision and Image Understanding 110, 346 (2008).
E. Simo-Serra, E. Trulls, L. Ferraz, I. Kokkinos, P. Fua and F. Moreno-Noguer, Discriminative Learning of Deep Convolutional Feature Point Descriptors, IEEE International Conference on Computer Vision, 118 (2015).
S. Zagoruyko and N. Komodakis, Learning to Compare Image Patches via Convolutional Neural Networks, IEEE Conference on Computer Vision and Pattern Recognition, 4353 (2015).
V. Balntas, E. Johns, L. Tang and K. Mikolajczyk, PN-Net: Conjoined Triple Deep Network for Learning Local Image Descriptors, arXiv: 1601.05030, (2016).
B. G. V. Kumar, G. Carneiro and I. Reid, Learning Local Image Descriptors with Deep Siamese and Triplet Convolutional Networks by Minimising Global Loss Functions, IEEE Conference on Computer Vision and Pattern Recognition, 5385 (2016).
Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama and T. Darrell, Caffe: Convolutional Architecture for Fast Feature Embedding, arXiv: 1408.5093, (2014).
M. Brown, G. Hua and S. Winder, IEEE Transactions on Pattern Analysis and Machine Intelligence 33, 43 (2011).
K. Simonyan, A. Vedaldi and A. Zisserman, IEEE Transactions on Pattern Analysis and Machine Intelligence 36, 1573 (2014).
Additional information
This work has been supported by the Natural Science Foundation of Zhejiang Province (No.Y16F020023). This paper was presented in part at the CCF Chinese Conference on Computer Vision, Tianjin, 2017. This paper was recommended by the program committee.
About this article
Cite this article
Zhang, Dl., Zhao, L., Xu, Dq. et al. Discriminatively learning for representing local image features with quadruplet model. Optoelectron. Lett. 13, 462–465 (2017). https://doi.org/10.1007/s11801-017-7198-z
DOI: https://doi.org/10.1007/s11801-017-7198-z