
Privacy Preserving Visual SLAM

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12367)

Abstract

This study proposes a privacy-preserving Visual SLAM framework that estimates camera poses and performs bundle adjustment on mixed line and point clouds in real time. Previous studies have proposed localization methods that estimate a camera pose from a line-cloud map, using either a single image or a reconstructed point cloud. These methods protect scene privacy against inversion attacks, which reconstruct scene images from a point cloud, by converting the point cloud into a line cloud. However, they are not directly applicable to a video sequence because they do not address computational efficiency, which is critical for estimating camera poses and performing bundle adjustment on mixed line and point clouds in real time. Moreover, no previous study has addressed optimizing a server-side line-cloud map with a point cloud reconstructed from a client video, because observation points in image coordinates are withheld to prevent inversion attacks, i.e., to keep the 3D lines irreversible. Experimental results on synthetic and real data show that our Visual SLAM framework achieves the intended privacy preservation and real-time performance using a line-cloud map.
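
For readers unfamiliar with line clouds, the sketch below illustrates the basic idea behind the privacy mechanism mentioned above: each 3D map point is replaced by a 3D line that passes through it in a randomly drawn direction, so the map still provides geometric constraints for pose estimation while withholding the point positions an inversion attack would need. This is a minimal illustration of the general line-lifting idea from prior privacy-preserving localization work, not the authors' implementation; the function and parameter names (lift_to_line_cloud, the offset range) are our own.

    # Minimal sketch (illustrative only, not the authors' code): lift a 3D point
    # cloud to a "line cloud" by replacing each point with a line through it
    # whose direction is drawn uniformly at random on the sphere.
    import numpy as np

    def lift_to_line_cloud(points: np.ndarray, rng=None):
        """points: (N, 3) array of 3D map points.
        Returns (origins, directions): each line is {origin + t * direction}."""
        rng = np.random.default_rng() if rng is None else rng
        n = points.shape[0]
        # Random unit directions: normalize Gaussian samples.
        dirs = rng.normal(size=(n, 3))
        dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
        # Slide each stored origin along its line so the origin itself no
        # longer coincides with the secret 3D point (offset range is arbitrary).
        offsets = rng.uniform(-10.0, 10.0, size=(n, 1))
        origins = points + offsets * dirs
        return origins, dirs

    if __name__ == "__main__":
        pts = np.random.rand(100, 3) * 5.0          # toy map points
        origins, dirs = lift_to_line_cloud(pts)
        # Sanity check: each original point still lies on its line.
        t = np.einsum("ij,ij->i", pts - origins, dirs)
        recon = origins + t[:, None] * dirs
        print(np.max(np.linalg.norm(recon - pts, axis=1)))  # ~0

A single line reveals only that its point lies somewhere along it; recovering the exact point requires additional constraints, which is why the abstract emphasizes that client-side observation points are never shared with the server.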

Keywords

Visual SLAM · Privacy · Line cloud · Point cloud

Supplementary material

Supplementary material 1: 504482_1_En_7_MOESM1_ESM.pdf (PDF, 19.5 MB)

Supplementary material 2 (MP4, 42.9 MB)


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, Japan
  2. Tokyo Institute of Technology, Meguro, Japan
