
PointPWC-Net: Cost Volume on Point Clouds for (Self-)Supervised Scene Flow Estimation

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12350)

Abstract

We propose a novel end-to-end deep scene flow model, called PointPWC-Net, that directly processes 3D point cloud scenes with large motions in a coarse-to-fine fashion. Flow computed at a coarse level is upsampled and warped to a finer level, enabling the algorithm to accommodate large motion without a prohibitive search space. We introduce novel cost volume, upsampling, and warping layers to efficiently handle 3D point cloud data. Unlike traditional cost volumes that require exhaustively computing all cost values on a high-dimensional grid, our point-based formulation discretizes the cost volume onto the input 3D points, and a PointConv operation efficiently computes convolutions on the cost volume. Experimental results on FlyingThings3D and KITTI show that our method outperforms the state-of-the-art by a large margin. We further explore novel self-supervised losses to train our model and achieve results comparable to those of state-of-the-art models trained with a supervised loss. Without any fine-tuning, our method also generalizes well to the KITTI Scene Flow 2015 dataset, outperforming all previous methods. The code is released at https://github.com/DylanWusee/PointPWC.
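The two core ideas in the abstract can be summarized in a few lines of NumPy. The following is a minimal sketch under stated assumptions, not the paper's actual layers: the k-nearest-neighbor inverse-distance interpolation used for upsampling and the dot-product matching cost are illustrative stand-ins, whereas the real network learns these aggregations with PointConv weights.

```python
# Illustrative sketch of (1) a cost volume discretized onto the input 3D
# points rather than a grid, and (2) upsampling + warping coarse flow to a
# finer pyramid level. The k-NN inverse-distance interpolation and the
# dot-product cost are assumptions made for illustration only.
import numpy as np

def knn(query, ref, k):
    """Indices of the k nearest reference points for each query point."""
    d = np.linalg.norm(query[:, None, :] - ref[None, :, :], axis=-1)
    return np.argsort(d, axis=1)[:, :k], d

def point_cost_volume(pts1, feats1, pts2, feats2, k=16):
    """Matching cost for each frame-1 point against its k nearest
    frame-2 points, stored directly on the input points (no grid)."""
    idx, _ = knn(pts1, pts2, k)
    return np.einsum('nc,nkc->nk', feats1, feats2[idx])  # (N1, k)

def upsample_flow(coarse_pts, coarse_flow, fine_pts, k=3, eps=1e-8):
    """Inverse-distance-weighted interpolation of coarse flow onto fine points."""
    idx, d = knn(fine_pts, coarse_pts, k)
    w = 1.0 / (np.take_along_axis(d, idx, axis=1) + eps)
    w /= w.sum(axis=1, keepdims=True)
    return (coarse_flow[idx] * w[..., None]).sum(axis=1)  # (N_fine, 3)

def warp(pts, flow):
    """Move frame-1 points toward frame 2 before estimating residual flow."""
    return pts + flow

# Toy usage: upsample a coarse flow field to a finer level, warp the points,
# then build a point-based cost volume against frame 2.
rng = np.random.default_rng(0)
coarse_pts = rng.standard_normal((64, 3))
coarse_flow = 0.1 * rng.standard_normal((64, 3))
fine_pts = rng.standard_normal((256, 3))
pts2 = rng.standard_normal((256, 3))
feat1 = rng.standard_normal((256, 8))
feat2 = rng.standard_normal((256, 8))

warped = warp(fine_pts, upsample_flow(coarse_pts, coarse_flow, fine_pts))
costs = point_cost_volume(warped, feat1, pts2, feat2)  # (256, 16)
```

Note how the exhaustive grid search of image-based cost volumes is replaced by a k-nearest-neighbor search over the points themselves; this, combined with the coarse-to-fine warping, is what keeps the search space tractable at each pyramid level.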

Keywords

Cost volume · Self-supervision · Coarse-to-fine · Scene flow

Notes

Acknowledgement

Wenxuan Wu and Li Fuxin were partially supported by the National Science Foundation (NSF) under Project #1751402, the USDA National Institute of Food and Agriculture (USDA-NIFA) under Award 2019-67019-29462, and the Defense Advanced Research Projects Agency (DARPA) under Contract Nos. N66001-17-12-4030 and N66001-19-2-4035. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the funding agencies.

Supplementary material

504441_1_En_6_MOESM1_ESM.pdf (3.4 MB)
Supplementary material 1 (PDF, 3,442 KB)

Supplementary material 2 (MP4, 46,867 KB)


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

1. CORIS Institute, Oregon State University, Corvallis, USA
2. Nuro, Inc., Mountain View, USA
