UGQE: Uncertainty Guided Query Expansion

  • Conference paper
Pattern Recognition and Artificial Intelligence (ICPRAI 2022)

Abstract

Query expansion is a standard technique in image retrieval: it enriches the original query by capturing features from relevant images and aggregating them into an expanded query. In this work, we present a new framework that incorporates uncertainty estimation on top of a self-attention mechanism during the expansion procedure. An uncertainty network provides additional information about the images that are relevant to the query, which increases the expressiveness of the expanded query. Experimental results demonstrate that integrating uncertainty information into a transformer network can improve mean Average Precision (mAP) on standard image retrieval datasets in comparison to existing methods. Moreover, our approach is the first to incorporate uncertainty into the aggregation of information in a query expansion procedure.
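The general idea of aggregating neighbor features into an expanded query can be illustrated with a minimal sketch. This is not the UGQE model: the abstract describes a learned self-attention network with an uncertainty network, whereas the sketch below uses a hand-written weighting rule, `1 / (1 + uncertainty)`, chosen purely for illustration. The function names `l2_normalize` and `expand_query`, and the assumption that one scalar uncertainty per neighbor is already available, are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def expand_query(query, neighbors, uncertainties):
    """Aggregate a query descriptor with its top-ranked neighbor descriptors.

    Each neighbor is weighted by 1 / (1 + uncertainty). This rule is an
    illustrative assumption, not the paper's learned aggregation: neighbors
    with low uncertainty contribute more to the expanded query.
    """
    if len(neighbors) != len(uncertainties):
        raise ValueError("need exactly one uncertainty value per neighbor")
    expanded = list(query)  # the original query keeps weight 1
    for neighbor, uncertainty in zip(neighbors, uncertainties):
        weight = 1.0 / (1.0 + uncertainty)
        for i, value in enumerate(neighbor):
            expanded[i] += weight * value
    return l2_normalize(expanded)


if __name__ == "__main__":
    query = [1.0, 0.0]
    neighbors = [[0.0, 1.0]]
    # A confident neighbor pulls the expanded query toward itself ...
    print(expand_query(query, neighbors, [0.0]))
    # ... while a highly uncertain neighbor barely changes the query.
    print(expand_query(query, neighbors, [1e9]))
```

With all uncertainties set to zero, this reduces to plain averaging of the query with its neighbors followed by L2 normalization; larger uncertainty values progressively suppress a neighbor's contribution.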

Notes

  1. https://github.com/filipradenovic/cnnimageretrieval-pytorch/issues/68.

  2. Unfortunately, we could not reproduce the performance results reported in the paper. Other researchers have reported the same problem in issues opened on the paper's GitHub repository.

  3. https://github.com/filipradenovic/cnnimageretrieval-pytorch.

  4. We will release our code and pretrained models for the reproducible baseline (LAttQE) as well as for UGQE at the time of publication.

Author information

Corresponding author

Correspondence to Firat Oncel.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Oncel, F., Aygün, M., Baykal, G., Unal, G. (2022). UGQE: Uncertainty Guided Query Expansion. In: El Yacoubi, M., Granger, E., Yuen, P.C., Pal, U., Vincent, N. (eds) Pattern Recognition and Artificial Intelligence. ICPRAI 2022. Lecture Notes in Computer Science, vol 13363. Springer, Cham. https://doi.org/10.1007/978-3-031-09037-0_10

  • DOI: https://doi.org/10.1007/978-3-031-09037-0_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-09036-3

  • Online ISBN: 978-3-031-09037-0

  • eBook Packages: Computer Science, Computer Science (R0)
