Abstract
The main question we address is whether it is possible to crowdsource navigational data in the form of video sequences captured from wearable cameras. Without using geometric inference techniques (such as SLAM), we test video data for its location-discrimination content. Tracking algorithms do not form part of this assessment, because our goal is to compare different visual descriptors for the purpose of location inference in highly ambiguous indoor environments. The testing of these descriptors, and different encoding methods, is performed by measuring the positional error inferred during one journey with respect to other journeys along the same approximate path.
There are three main contributions described in this paper. First, we compare different techniques for visual feature extraction with the aim of associating locations between different journeys along roughly the same physical route. Second, we propose measuring the quality of position inference relative to multiple passes through the same route by introducing a positional ground-truth estimate acquired with modified surveying instrumentation. Finally, we contribute a database of nearly 100,000 frames with this positional ground truth, covering more than 3 km of indoor journeys captured with a hand-held device (Nexus 4) and a wearable device (Google Glass).
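The evaluation described above — scoring a descriptor by the positional error it induces when frames from one journey are matched against frames from another pass along the same route — can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes descriptors have already been extracted per frame, uses plain Euclidean nearest-neighbour matching, and represents ground truth as a 1-D distance along the path.

```python
import numpy as np

def positional_errors(query_desc, query_pos, db_desc, db_pos):
    """For each query-frame descriptor, find the nearest database frame
    in descriptor space and return the absolute difference between
    their ground-truth path positions (metres)."""
    # Pairwise descriptor distances: shape (n_query, n_db)
    d = np.linalg.norm(query_desc[:, None, :] - db_desc[None, :, :], axis=2)
    nearest = d.argmin(axis=1)                   # best-matching db frame per query
    return np.abs(query_pos - db_pos[nearest])   # positional error along the path

# Toy example: one "database" journey of 50 frames along a 30 m corridor,
# and a second pass whose frames are slightly perturbed copies.
rng = np.random.default_rng(0)
db_desc = rng.standard_normal((50, 8))           # stand-in 8-D descriptors
db_pos = np.linspace(0.0, 30.0, 50)              # metres along the route
query_desc = db_desc[::5] + 0.01 * rng.standard_normal((10, 8))
query_pos = db_pos[::5]

errs = positional_errors(query_desc, query_pos, db_desc, db_pos)
print(errs.mean())
```

With near-identical descriptors the mean error is close to zero; comparing this statistic across descriptor and encoding choices on real journeys is the kind of assessment the paper performs.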
Keywords
- Video Sequence
- Receive Signal Strength Indication
- Visual Data
- Indoor Localization
- British Machine Vision
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2015 Springer International Publishing Switzerland
Cite this paper
Rivera-Rubio, J., Alexiou, I., Bharath, A.A. (2015). Associating Locations Between Indoor Journeys from Wearable Cameras. In: Agapito, L., Bronstein, M., Rother, C. (eds) Computer Vision - ECCV 2014 Workshops. ECCV 2014. Lecture Notes in Computer Science, vol 8928. Springer, Cham. https://doi.org/10.1007/978-3-319-16220-1_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16219-5
Online ISBN: 978-3-319-16220-1
eBook Packages: Computer Science (R0)