Technologies for Visual Localization and Augmented Reality in Smart Cities

Amato, Giuseppe; Cardillo, Franco Alberto; Falchi, Fabrizio

doi:10.1007/978-3-319-50518-3_20

Part of the book series: Geotechnologies and the Environment (GEOTECH, volume 16)

Abstract

The widespread diffusion of smart devices, such as smartphones and tablets, and the emerging trend of wearable devices, such as smart glasses and smart watches, have pushed forward the development of applications in which the user interacts on the basis of his or her position and field of view. Such applications can also deliver additional information in augmented reality, that is, information seen through the smart device and overlaid on the real scene. GPS or the compass can be used to localize the user when augmented reality is applied to large scenes, for instance, squares or large buildings. However, when augmented reality is used to enrich the view of small objects or small details of larger objects, for instance, statues, paintings, or epigraphs, more precise positioning is needed. Visual object recognition and tracking technologies offer detailed, fine-grained positioning capabilities. This chapter discusses the techniques that enable precise positioning of the user and the subsequent augmented reality experience, focusing on algorithms for image matching and homography estimation between the images seen by smart devices and images representing objects of interest.

Notes

1. CPU: central processing unit; GPU: graphics processing unit.
2. http://artoolkit.org/
3. http://www.openscenegraph.org/
4. https://unity3d.com/
5. https://www.qualcomm.com/products/vuforia
6. http://www.t-immersion.com/products/dfusion-suite
7. http://www.wikitude.com/blog/dev2dev/
8. http://vcg.isti.cnr.it/vcglib/
9. https://libgdx.badlogicgames.com/
10. http://opencv.org/
11. http://www.cs.cornell.edu/~snavely/bundler/
12. https://photosynth.net/

Author information

Authors and Affiliations

CNR-ISTI, Pisa, Italy
Giuseppe Amato, Franco Alberto Cardillo & Fabrizio Falchi

Corresponding author

Correspondence to Giuseppe Amato.

Editor information

Editors and Affiliations

CNR-IBAM Institute of Archaeological and Monumental Heritage, Tito Scalo, Potenza, Italy
Nicola Masini
Institute for Electromagnetic Sensing of the Environment (CNR-IREA), Napoli, Italy
Francesco Soldovieri

About this chapter

Cite this chapter

Amato, G., Cardillo, F.A., Falchi, F. (2017). Technologies for Visual Localization and Augmented Reality in Smart Cities. In: Masini, N., Soldovieri, F. (eds) Sensing the Past. Geotechnologies and the Environment, vol 16. Springer, Cham. https://doi.org/10.1007/978-3-319-50518-3_20

DOI: https://doi.org/10.1007/978-3-319-50518-3_20
Published: 12 April 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-50516-9
Online ISBN: 978-3-319-50518-3
eBook Packages: Earth and Environmental Science (R0)