Abstract
The widespread diffusion of smart devices, such as smartphones and tablets, and the emerging trend of wearable devices, such as smart glasses and smart watches, have pushed forward the development of applications in which users interact on the basis of their position and field of view. Such applications can also deliver additional information in augmented reality, that is, information seen through the smart device and overlaid on the real scene. The GPS or the compass can be used to localize the user when augmented reality is applied to large scenes, for instance, squares or large buildings. However, when augmented reality is meant to enrich the view of small objects or of small details of larger objects, for instance, statues, paintings, or epigraphs, more precise positioning is needed. Visual object recognition and tracking technologies offer this kind of fine-grained positioning. This chapter discusses the techniques that enable precise positioning of the user and the subsequent augmented reality experience, focusing on algorithms for image matching and for homography estimation between the images seen by smart devices and images representing objects of interest.
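To make the notion of homography estimation concrete, the following is a minimal sketch, not the chapter's own implementation. It assumes that an image matching step has already produced at least four point correspondences between a reference image of the object and the camera frame, and it uses the basic direct linear transform without coordinate normalization or outlier rejection. The function names `estimate_homography` and `project` are hypothetical, chosen only for this illustration.

```python
import numpy as np


def estimate_homography(src, dst):
    """Estimate a 3x3 matrix H with dst ~ H @ src (basic DLT).

    Assumption: src and dst are (N, 2) arrays of matched points, N >= 4,
    with no outliers and no three points collinear.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    if src.shape != dst.shape or src.ndim != 2 or src.shape[1] != 2:
        raise ValueError("src and dst must both have shape (N, 2)")
    if len(src) < 4:
        raise ValueError("at least four correspondences are required")

    # Each correspondence (x, y) -> (u, v) contributes two linear equations
    # in the nine entries of H.
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])

    # The solution is the right singular vector associated with the
    # smallest singular value of the stacked system.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    h = vt[-1].reshape(3, 3)

    # Fix the arbitrary scale; assumes the bottom-right entry is non-zero.
    return h / h[2, 2]


def project(h, points):
    """Map (N, 2) points through the homography h."""
    pts = np.asarray(points, dtype=float)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homogeneous @ h.T
    return mapped[:, :2] / mapped[:, 2:3]


if __name__ == "__main__":
    # Illustrative data only: corners of a reference image and the
    # positions where they are assumed to appear in the camera frame.
    reference = [(0, 0), (100, 0), (100, 80), (0, 80)]
    observed = [(12, 8), (105, 15), (98, 90), (5, 82)]
    H = estimate_homography(reference, observed)
    # Where an overlay anchored at the image centre would be drawn.
    print(project(H, [(50, 40)]))
```

With real matches, which usually contain wrong correspondences, a robust estimator would be wrapped around this step; the sketch deliberately leaves that out.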
Notes
- 1. CPU: central processing unit; GPU: graphics processing unit.
Copyright information
© 2017 Springer International Publishing AG
About this chapter
Cite this chapter
Amato, G., Cardillo, F.A., Falchi, F. (2017). Technologies for Visual Localization and Augmented Reality in Smart Cities. In: Masini, N., Soldovieri, F. (eds) Sensing the Past. Geotechnologies and the Environment, vol 16. Springer, Cham. https://doi.org/10.1007/978-3-319-50518-3_20
DOI: https://doi.org/10.1007/978-3-319-50518-3_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-50516-9
Online ISBN: 978-3-319-50518-3
eBook Packages: Earth and Environmental Science (R0)