
A Self-regulating Spatio-Temporal Filter for Volumetric Video Point Clouds

  • Conference paper
  • In: Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2019)

Abstract

This work presents a self-regulating filter capable of accurately upsampling dynamic point cloud sequences captured with wide-baseline multi-view camera setups. It achieves this through two-way temporal projection of edge-aware upsampled point clouds, imposing coherence and noise suppression via a windowed, self-regulating noise filter. We use a state-of-the-art Spatio-Temporal Edge-Aware scene flow estimator to accurately model the motion of points across a sequence and then, exploiting the spatio-temporal inconsistency of unstructured noise, apply a weighted Hausdorff distance-based noise filter over a given window. Our results demonstrate that this approach produces temporally coherent, upsampled point clouds while mitigating both additive and unstructured noise. In addition to filtering noise, the algorithm greatly reduces intermittent loss of pertinent geometry. The system performs well in dynamic real-world scenarios with both stationary and non-stationary cameras, as well as in synthetically rendered environments used for a baseline study.
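To make the windowed filtering step concrete, the sketch below shows one plausible reading of a weighted, Hausdorff-style temporal consistency check: each point in the centre frame of the window is scored by its weighted nearest-neighbour distance to the motion-compensated neighbouring frames, and points whose score exceeds a data-driven threshold are discarded as unstructured noise. This is a minimal illustration, not the paper's exact formulation; the function name, the triangular frame weighting, and the mean-plus-k-sigma threshold are all assumptions.

```python
# Hypothetical sketch of a windowed, Hausdorff-style temporal noise filter.
# Assumes every cloud in the window has already been motion-compensated
# (e.g. via scene flow) into the centre frame's coordinate space.
import numpy as np
from scipy.spatial import cKDTree

def temporal_noise_filter(window, center_idx, weights=None, k_sigma=1.5):
    """Keep points of the centre cloud that are temporally consistent.

    window     : list of (N_i, 3) numpy arrays, one cloud per frame,
                 all aligned to the centre frame.
    center_idx : index of the frame being filtered.
    weights    : per-frame weights; assumed to decay with temporal
                 distance from the centre frame.
    k_sigma    : tolerance in standard deviations above the mean score
                 (the "self-regulating" part: derived from the data
                 rather than fixed by hand -- an assumption here).
    """
    center = window[center_idx]
    others = [c for i, c in enumerate(window) if i != center_idx]
    if weights is None:
        # Simple triangular weighting by temporal distance (assumption).
        weights = [1.0 / (1 + abs(i - center_idx))
                   for i in range(len(window)) if i != center_idx]

    # One-sided Hausdorff term per neighbouring frame: distance from
    # each centre point to its nearest neighbour in that frame.
    score = np.zeros(len(center))
    for w, cloud in zip(weights, others):
        d, _ = cKDTree(cloud).query(center, k=1)
        score += w * d
    score /= sum(weights)

    # Data-driven threshold: points scoring far above the window's
    # typical consistency are treated as unstructured noise.
    keep = score <= score.mean() + k_sigma * score.std()
    return center[keep]
```

Under these assumptions, calling `temporal_noise_filter([c0, c1, c2], center_idx=1)` on three scene-flow-aligned clouds returns a denoised version of the middle frame; genuine geometry persists across frames and scores low, while transient noise scores high and is removed.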

This publication has emanated from research conducted with the financial support of Science Foundation Ireland (SFI) under grant No. 15/RP/2776. We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPU used for this research.



Author information

Correspondence to Matthew Moynihan.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Moynihan, M., Pagés, R., Smolic, A. (2020). A Self-regulating Spatio-Temporal Filter for Volumetric Video Point Clouds. In: Cláudio, A., et al. Computer Vision, Imaging and Computer Graphics Theory and Applications. VISIGRAPP 2019. Communications in Computer and Information Science, vol 1182. Springer, Cham. https://doi.org/10.1007/978-3-030-41590-7_16


  • DOI: https://doi.org/10.1007/978-3-030-41590-7_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-41589-1

  • Online ISBN: 978-3-030-41590-7

  • eBook Packages: Computer Science, Computer Science (R0)
