Multi-Camera Active-Vision for Markerless Shape Recovery of Unknown Deforming Objects

Abstract

A novel multi-camera active-vision reconfiguration method is proposed for the markerless shape recovery of unknown deforming objects. The proposed method implements a model-fusion technique to obtain a complete 3D mesh model via triangulation and a visual hull. The model is tracked using an adaptive particle-filtering algorithm, yielding a deformation estimate that can then be used to reconfigure the cameras for improved surface visibility. The objective of reconfiguration is the maximization of the total surface area visible through stereo triangulation. This surface-area-based objective function directly relates to maximizing the accuracy of the recovered shape, since stereo triangulation is more accurate than visual-hull building when the number of viewpoints is limited. The reconfiguration process comprises workspace discretization, visibility estimation, optimal stereo-pose selection, and path planning to ensure 2D tracked-feature consistency. In contrast to other reconfiguration techniques that rely on a priori known, and at times static, object models, our method addresses a priori unknown deforming objects. The proposed method operates on-line and, through extensive simulations and experiments, has been shown to outperform static-camera systems by increasing surface visibility in the presence of occluding obstacles.


Abbreviations

C_M : A matrix of the object model's center point repeated k times [k × 3].
K : An indexing matrix of all positional combinations [p_max × c_k].
K^* : A filtered indexing matrix [p_red × c_k].
L(t) : The system and workspace constraints at demand instant t.
M : The object model at demand instant t.
M^+ : The expected object-model deformation at the next demand instant.
O^+ : The expected obstacle model at the next demand instant.
P_c(t) : The set of camera parameters at demand instant t.
S(t) : The surface area of the object at demand instant t.
R(k) : The estimated surface-area visibility score for the kth positional combination.
V : A matrix of test positional vectors for reconfiguration [n_v × 3].
V(t + 1) : The expected model visibility at the next demand instant.
V_map : The normalized visibility-proportion mapping matrix [n_p × n_v].
V_map^* : A subset of V_map [n_p × c_n].
X_test : A matrix of test points from model M^+ [k × 3].
X_test^* : The set of test points projected onto an arbitrary plane [k × 3].
Y : A Boolean matrix of test-point visibility [k × 4].
a_j : The area of the jth polygon in the model at the current demand instant.
c_k : The number of stereo camera pairs at the current demand instant.
c_k^* : The number of filtered stereo camera pairs at the current demand instant.
c_M : The center point of the object model [1 × 3].
c_M^+ : The projected center point of the object model [1 × 3].
c_M^* : A point in space produced by the triangulation of two rays associated with the object's center point and stereo-pair placement [1 × 3].
c_n : The number of stereo camera pairs at the current demand instant.
c_obs : The center point of the obstacle [1 × 3].
d^d : A vector of distances along the test positional vector from the model center to all test points.
d_thresh^d : The maximum allowable distance for d^d.
d^w : A vector of distances between the model center and all test points on a plane.
d_thresh^w : The maximum allowable distance for d^w.
g : The distance, in pixels, from the object's bounding rectangle to the image edge.
h : The stereo-camera-pair placement score.
k : An arbitrary row of the K matrix [1 × c_k].
n_p : The number of model polygons at the current demand instant.
n_v : The number of test positional vectors.
p : A point on a test positional vector [1 × 3].
p_c : The mean stereo-camera-pair position at the current demand instant [1 × 3].
p_c^+ : The mean stereo-camera-pair position at the next demand instant [1 × 3].
p_b^1, p_b^2 : The Bézier-curve control points [1 × 3].
p_max : The total number of possible positional combinations.
p_red : The number of positional combinations used.
q : The unit normal vector of an arbitrary test point.
v_depth : A vector of Boolean visibility values for each test point [k × 1].
v_HPR : A vector of Boolean visibility values for each test point [k × 1].
v_norm : A vector of Boolean visibility values for each test point [k × 1].
v_test : A unit test vector from V [1 × 3].
v_width : A vector of Boolean visibility values for each test point [k × 1].
y : A vector of the logical conjunction of Y across rows [k × 1].
β : The maximum path-motion angle.
θ_min : The minimum angular separation between two sets of stereo camera pairs.
σ_thresh^d : The standard-deviation threshold value for the depth operator.
σ_thresh^w : The standard-deviation threshold value for the width operator.
ϕ : The angular separation between the test positional vector and a test-point normal.
ϕ_max : The maximum angle between a test positional vector and a point normal.
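For illustration only, the sketch below (Python/NumPy) shows one plausible way the quantities defined above could combine into the surface-area visibility score R(k): each positional combination indexed by a row of K is scored using the polygon areas a_j of the predicted model M^+, weighted by the visibility proportions stored in V_map. The aggregation rule is an assumption, not the paper's exact scoring.

```python
import numpy as np

def visibility_scores(V_map, K, areas):
    """Estimated visible surface area R(k) for each positional combination.

    V_map : (n_p, n_v) normalized visibility proportions of each model polygon
            from each test positional vector.
    K     : (p_red, c_k) integer index matrix; row k lists the test-position
            indices assigned to the stereo pairs in combination k.
    areas : (n_p,) polygon areas a_j of the predicted model M+.
    """
    R = np.empty(len(K))
    for k, combo in enumerate(K):
        # Assume a polygon counts via the best visibility proportion it achieves
        # among the stereo poses selected in this combination.
        best = V_map[:, combo].max(axis=1)
        R[k] = areas @ best            # area-weighted visible proportion
    return R

# e.g., reconfigure towards K[np.argmax(visibility_scores(V_map, K, areas))]
```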

References

  1. Song, B., Ding, C., Kamal, A., Farrell, J.A., Roy-Chowdhury, A.K.: Distributed camera networks. IEEE Signal Process. Mag. 28(3), 20–31 (2011)

  2. Piciarelli, C., Esterle, L., Khan, A., Rinner, B., Foresti, G.L.: Dynamic reconfiguration in camera networks: a short survey. IEEE Trans. Circuits Syst. Video Technol. 26(5), 965–977 (2016)

  3. Ilie, A., Welch, G., Macenko, M.: A Stochastic Quality Metric for Optimal Control of Active Camera Network Configurations for 3D Computer Vision Tasks. In: ECCV Workshop on Multicamera and Multimodal Sensor Fusion Algorithms and Applications, pp 1–12 (2008)

  4. Cowan, C.K.: Model-Based Synthesis of Sensor Location. In: IEEE International Conference on Robotics and Automation (ICRA), pp 900–905 (1988)

  5. Tarabanis, K.A., Tsai, R.Y., Abrams, S.: Planning Viewpoints that Simultaneously Satisfy Several Feature Detectability Constraints for Robotic Vision. In: International Conference on Advanced Robotics “Robots in Unstructured Environments”, vol. 2, pp 1410–1415 (1991)

  6. Qureshi, F.Z., Terzopoulos, D.: Surveillance in Virtual Reality: System Design and Multi-Camera Control. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 1–8 (2007)

  7. Piciarelli, C., Micheloni, C., Foresti, G.L.: PTZ Camera Network Reconfiguration. In: International Conference on Distributed Smart Cameras (ICDSC), pp 1–7 (2009)

  8. Collins, R.T., Amidi, O., Kanade, T.: An Active Camera System for Acquiring Multi-View Video. In: International Conference on Image Processing, pp 1–4 (2002)

  9. Chen, S.Y., Li, Y.F.: Vision sensor planning for 3-d model acquisition. IEEE Trans. Syst. Man, Cybern. Part B Cybern. 35(5), 894–904 (2005)

  10. Schramm, F., Geffard, F., Morel, G., Micaelli, A.: Calibration Free Image Point Path Planning Simultaneously Ensuring Visibility and Controlling Camera Path. In: IEEE International Conference on Robotics and Automation (ICRA), pp 2074–2079 (2007)

  11. Amamra, A., Amara, Y., Benaissa, R., Merabti, B.: Optimal Camera Path Planning for 3D Visualisation. In: SAI Computing Conference, pp 388–393 (2016)

  12. Mir-Nasiri, N.: Camera-Based 3D Object Tracking and Following Mobile Robot. In: IEEE Conference on Robotics, Automation and Mechatronics, pp 1–6 (2006)

  13. Abrams, S., Allen, P.K., Tarabanis, K.A.: Dynamic Sensor Planning. In: IEEE International Conference on Robotics and Automation (ICRA), vol. 2, pp 605–610 (1993)

  14. Tarabanis, K.A., Tsai, R.Y., Allen, P.K.: The MVP sensor planning system for robotic vision tasks. IEEE Trans. Robot. Autom. 11(1), 72–85 (1995)

  15. Christie, M., Machap, R., Normand, J.-M., Olivier, P., Pickering, J.: Virtual Camera Planning: a Survey. In: International Symposium on Smart Graphics, pp 40–52 (2005)

  16. Bakhtari, A., Naish, M.D., Eskandari, M., Croft, E.A., Benhabib, B.: Active-vision-based multisensor surveillance-an implementation. IEEE Trans. Syst. Man, Cybern. Part C Appl. Rev. 36(5), 668–680 (2006)

  17. MacKay, M.D., Fenton, R.G., Benhabib, B.: Pipeline-architecture based real-time active-vision for human-action recognition. J. Intell. Robot. Syst. 72(3–4), 385–407 (2013)

  18. Schacter, D.S., Donnici, M., Nuger, E., MacKay, M.D., Benhabib, B.: A multi-camera active-vision system for deformable-object-motion capture. J. Intell. Robot. Syst. 75(3), 413–441 (2014)

  19. Herrera, J.L. A., Chen, X.: Consensus Algorithms in a Multi-Agent Framework to Solve PTZ Camera Reconfiguration in UAVs. In: International Conference on Intelligent Robotics and Applications, pp 331–340 (2012)

  20. Konda, K.R., Conci, N.: Real-Time Reconfiguration of PTZ Camera Networks Using Motion Field Entropy and Visual Coverage. In: Proceedings of the International Conference on Distributed Smart Cameras, p 18 (2014)

  21. Natarajan, P., Hoang, T.N., Low, K.H., Kankanhalli, M.: Decision-Theoretic Coordination and Control for Active Multi-Camera Surveillance in Uncertain, Partially Observable Environments. In: International Conference on Distributed Smart Cameras (ICDSC), pp 1–6 (2012)

  22. Song, B., Soto, C., Roy-Chowdhury, A.K., Farrell, J.A.: Decentralized Camera Network Control Using Game Theory. In: 2008 Second ACM/IEEE International Conference on Distributed Smart Cameras, pp 1–8 (2008)

  23. Ding, C., Song, B., Morye, A., Farrell, J.A., Roy-Chowdhury, A.K.: Collaborative sensing in a distributed PTZ camera network. IEEE Trans. Image Process. 21(7), 3282–3295 (2012)

  24. Del Bimbo, A., Dini, F., Lisanti, G., Pernici, F.: Exploiting distinctive visual landmark maps in pan-tilt-zoom camera networks. Comput. Vis. Image Underst. 114(6), 611–623 (2010)

  25. Piciarelli, C., Micheloni, C., Foresti, G.L.: Occlusion-Aware Multiple Camera Reconfiguration. In: International Conference on Distributed Smart Cameras (ICDSC), p 88 (2010)

  26. Indu, S., Chaudhury, S., Mittal, N.R., Bhattacharyya, A.: Optimal Sensor Placement for Surveillance of Large Spaces. In: International Conference on Distributed Smart Cameras (ICDSC), pp 1–8 (2009)

  27. Schwager, M., Julian, B.J., Angermann, M., Rus, D.: Eyes in the sky: decentralized control for the deployment of robotic camera networks. Proc. IEEE 99(9), 1541–1561 (2011)

  28. Konda, K.R., Rosani, A., Conci, N., De Natale, F.G.B.: Smart Camera Reconfiguration in Assisted Home Environments for Elderly Care. In: European Conference on Computer Vision (ECCV), pp 45–58 (2014)

  29. Tarabanis, K.A., Allen, P.K., Tsai, R.Y.: A survey of sensor planning in computer vision. IEEE Trans. Robot. Autom. 11(1), 86–104 (1995)

  30. Pito, R.: A solution to the next best view problem for automated surface acquisition. IEEE Trans. Pattern Anal. Mach. Intell. 21(10), 1016–1030 (1999)

  31. Wong, L.M., Dumont, C., Abidi, M.A.: Next Best View System in a 3D Object Modeling Task. In: IEEE International Symposium on Computational Intelligence in Robotics and Automation (CIRA), pp 306–311 (1999)

  32. Bircher, A., Kamel, M., Alexis, K., Oleynikova, H., Siegwart, R.: Receding Horizon ‘Next-Best-View’ Planner for 3D Exploration. In: 2016 IEEE International Conference on Robotics and Automation (ICRA), pp 1462–1468 (2016)

  33. Liska, C., Sablatnig, R.: Adaptive 3D Acquisition Using Laser Light. In: Czech Pattern Recognition Workshop, pp 111–117 (2000)

  34. Chan, M.-Y., Mak, W.-H., Qu, H.: An Efficient Quality-Based Camera Path Planning Method for Volume Exploration. In: International Symposium on Visual Computing, pp 12–21 (2008)

  35. Benhamou, F., Goualard, F., Languénou, É., Christie, M.: Interval constraint solving for camera control and motion planning. ACM Trans. Comput. Log. 5(4), 732–767 (2004)

  36. Assa, J., Wolf, L., Cohen-Or, D.: The virtual director: a correlation-based online viewing of human motion. Comput. Graph. Forum 29(2), 595–604 (2010)

  37. Naish, M.D., Croft, E.A., Benhabib, B.: Simulation-Based Sensing-System Configuration for Dynamic Dispatching. In: IEEE International Conference on Systems, Man and Cybernetics, vol. 5, pp 2964–2969 (2001)

  38. Naish, M.D., Croft, E.A., Benhabib, B.: Coordinated dispatching of proximity sensors for the surveillance of manoeuvring targets. Robot. Comput. Integr. Manuf. 19(3), 283–299 (2003)

  39. Bakhtari, A., Benhabib, B.: An active vision system for multitarget surveillance in dynamic environments. IEEE Trans. Syst. Man, Cybern. Part B Cybern. 37(1), 190–198 (2007)

  40. Bakhtari, A., MacKay, M.D., Benhabib, B.: Active-vision for the autonomous surveillance of dynamic, multi-object environments. J. Intell. Robot. Syst. 54(4), 567–593 (2009)

  41. Tan, J.K., Ishikawa, S., Yamaguchi, I., Naito, T., Yokota, M.: 3-D recovery of human motion by mobile stereo cameras. Artif. Life Robot. 10(1), 64–68 (2006)

  42. Malik, R., Bajcsy, P.: Automated Placement of Multiple Stereo Cameras. In: ECCV Workshop on Omnidirectional Vision, Camera Networks and Non-Classical Cameras (2008)

  43. Hasler, N., Rosenhahn, B., Thormählen, T., Wand, M., Gall, J., Seidel, H.P.: Markerless Motion Capture with Unsynchronized Moving Cameras. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 224–231 (2009)

  44. MacKay, M.D., Fenton, R.G., Benhabib, B.: Time-varying-geometry object surveillance using a multi-camera active-vision system. Int. J. Smart Sens. Intell. Syst. 1(3), 679–704 (2008)

  45. MacKay, M.D., Fenton, R.G., Benhabib, B.: Multi-camera active surveillance of an articulated human form - an implementation strategy. Comput. Vis. Image Underst. 115(10), 1395–1413 (2011)

  46. Hofmann, M., Gavrila, D.M.: Multi-view 3D human pose estimation in complex environment. Int. J. Comput. Vis. 96(1), 103–124 (2012)

  47. Schacter, D.S.: Multi-camera active-vision system reconfiguration for deformable-object motion capture. University of Toronto (2014)

  48. Zhao, W., Gao, S., Lin, H.: A robust hole-filling algorithm for triangular mesh. Vis. Comput. 23(12), 987–997 (2007)

  49. Hilton, A., Stoddart, A.J., Illingworth, J., Windeatt, T.: Reliable Surface Reconstruction from Multiple Range Images. In: European Conference on Computer Vision (ECCV), pp 117–126 (1996)

  50. Davis, J., Marschner, S.R., Garr, M., Levoy, M.: Filling Holes in Complex Surfaces Using Volumetric Diffusion. In: Proceedings. First International Symposium on 3D Data Processing Visualization and Transmission, pp 428–861 (2002)

  51. Kalal, Z., Matas, J., Mikolajczyk, K.: Online Learning of Robust Object Detectors during Unstable Tracking. In: IEEE International Conference on Computer Vision Workshops (ICCV Workshops), pp 1417–1424 (2009)

  52. Forsyth, D.A., Ponce, J.: Computer Vision: A Modern Approach, 2nd edn. Pearson, London (2012)

  53. Hughes, J.F. et al.: Computer Graphics: Principles and Practice, 3rd edn. Addison-Wesley Professional, Boston (2013)

  54. Laurentini, A.: Visual hull concept for silhouette-based image understanding. IEEE Trans. Pattern Anal. Mach. Intell. 16(2), 150–162 (1994)

  55. Terauchi, T., Oue, Y., Fujimura, K.: A Flexible 3D Modeling System Based on Combining Shape-From-Silhouette with Light-Sectioning Algorithm. In: International Conference on 3-D Digital Imaging and Modeling, pp 196–203 (2005)

  56. Hernández Esteban, C., Schmitt, F.: Silhouette and stereo fusion for 3d object modeling. Comput. Vis. Image Underst. 96(3), 367–392 (2004)

  57. Cremers, D., Kolev, K.: Multiview stereo and silhouette consistency via convex functionals over convex domains. IEEE Trans. Pattern Anal. Mach. Intell. 33(6), 1161–1174 (2011)

  58. Liu, Y., Dai, Q., Xu, W.: A point-cloud-based multiview stereo algorithm for free-viewpoint video. IEEE Trans. Vis. Comput. Graph. 16(3), 407–418 (2010)

  59. Hebert, P. et al.: Combined Shape, Appearance and Silhouette for Simultaneous Manipulator and Object Tracking. In: International Conference on Robotics and Automation, pp 2405–2412 (2012)

  60. Song, P., Wu, X., Wang, M.Y.: Volumetric stereo and silhouette fusion for image-based modeling. Vis. Comput. 26(12), 1435–1450 (2010)

  61. Nuger, E., Benhabib, B.: Multicamera fusion for shape estimation and visibility analysis of unknown deforming objects. J. Electron. Imaging 25(4), 41009 (2016)

  62. Huang, C.-H., Cagniart, C., Boyer, E., Ilic, S.: A Bayesian approach to multi-view 4D modeling. Int. J. Comput. Vis. 116(2), 115–135 (2016)

  63. Matusik, W., Buehler, C., Raskar, R., Gortler, S.J., McMillan, L.: Image-Based Visual Hulls. In: Proceedings of the Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH, pp 369–374 (2000)

  64. Corazza, S., Mündermann, L., Gambaretto, E., Ferrigno, G., Andriacchi, T.P.: Markerless motion capture through visual hull, articulated ICP and subject specific model generation. Int. J. Comput. Vis. 87(1–2), 156–169 (2010)

  65. Li, Q., Xu, S., Xia, D., Li, D.: A Novel 3D Convex Surface Reconstruction Method Based on Visual Hull. In: Pattern Recognition and Computer Vision, vol. 8004, p 800412 (2011)

  66. Roshnara Nasrin, P.P., Jabbar, S.: Efficient 3D Visual Hull Reconstruction Based on Marching Cube Algorithm. In: International Conference on Innovations in Information, Embedded and Communication Systems (ICIIECS), pp 1–6 (2015)

  67. Mercier, B., Meneveaux, D., Fournier, A.: A framework for automatically recovering object shape, reflectance and light sources from calibrated images. Int. J. Comput. Vis. 73(1), 77–93 (2007)

  68. Lorensen, W.E., Cline, H.E.: Marching Cubes: a High Resolution 3D Surface Construction Algorithm. In: Proceedings of the 14Th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH), vol. 21, no. 4, pp 163–169 (1987)

  69. Lazebnik, S., Furukawa, Y., Ponce, J.: Projective visual hulls. Int. J. Comput. Vis. 74(2), 137–165 (2007)

  70. Tomasi, C., Kanade, T.: Shape and motion from image streams: a factorization method. Proc. Natl. Acad. Sci. 90(21), 9795–9802 (1993)

  71. Pollefeys, M., Vergauwen, M., Cornelis, K., Tops, J., Verbiest, F., Van Gool, L.: Structure and Motion from Image Sequences. In: Proceedings of the Conference on Optical 3D Measurement Techniques, pp 251–258 (2001)

  72. Lhuillier, M., Quan, L.: A quasi-dense approach to surface reconstruction from uncalibrated images. IEEE Trans. Pattern Anal. Mach. Intell. 27(3), 418–433 (2005)

  73. Snavely, N., Seitz, S.M., Szeliski, R.: Photo tourism: exploring photo collections in 3d. ACM Trans. Graph. 25(3), 835–846 (2006)

  74. Del Bue, A., Agapito, L.: Non-rigid stereo factorization. Int. J. Comput. Vis. 66(2), 193–207 (2006)

  75. Huang, Y., Tu, J., Huang, T.S.: A Factorization Method in Stereo Motion for Non-Rigid Objects. In: IEEE International Conference on Acoustics, Speech and Signal Processing, No. 1, pp 1065–1068 (2008)

  76. Bay, H., Ess, A., Tuytelaars, T., Van Gool, L.: Speeded up robust features (SURF). Comput. Vis. Image Underst. 110(3), 346–359 (2008)

  77. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60(2), 91–110 (2004)

  78. Furukawa, Y., Ponce, J.: Accurate, dense, and robust multiview stereopsis. IEEE Trans. Pattern Anal. Mach. Intell. 32(8), 1362–1376 (2010)

  79. Kalman, R.E.: A new approach to linear filtering and prediction problems. ASME Trans. J. Basic Eng. 82(Series D), 35–45 (1960)

  80. Welch, G., Bishop, G.: An Introduction to the Kalman Filter. Technical Report TR 95-041, University of North Carolina at Chapel Hill (2006)

  81. Ristic, B., Arulampalam, S., Gordon, N.: A Tutorial on Particle Filters. In: Beyond the Kalman Filter: Particle Filter for Tracking Applications, pp 35–62. Artech House, Boston (2004)

  82. Sui, Y., Zhang, L.: Robust tracking via locally structured representation. Int. J. Comput. Vis. 119(2), 110–144 (2016)

  83. Gonzales, C., Dubuisson, S.: Combinatorial resampling particle filter: an effective and efficient method for articulated object tracking. Int. J. Comput. Vis. 112(3), 255–284 (2015)

  84. Kwolek, B., Krzeszowski, T., Gagalowicz, A., Wojciechowski, K., Josinski, H.: Real-Time Multi-View Human Motion Tracking Using Particle Swarm Optimization with Resampling. In: International Conference on Articulated Motion and Deformable Objects (AMDO), pp 92–101 (2012)

  85. Zhang, X., Hu, W., Xie, N., Bao, H., Maybank, S.: A robust tracking system for low frame rate video. Int. J. Comput. Vis. 115(3), 279–304 (2015)

  86. Maung, T.H.H.: Real-time hand tracking and gesture recognition system using neural networks. World Acad. Sci. Eng. Technol. 50, 466–470 (2009)

  87. Agarwal, A., Datla, S., Tyagi, B., Niyogi, R.: Novel design for real time path tracking with computer vision using neural networks. Int. J. Comput. Vis. Robot. 1(4), 380–391 (2010)

  88. Katz, S., Tal, A., Basri, R.: Direct visibility of point sets. ACM Trans. Graph. 26(3), 1–12 (2007)

  89. Möller, T., Trumbore, B.: Fast, minimum storage ray-triangle intersection. J. Graph. Tools 2(1), 21–28 (1997)

  90. Kim, W.S., Ansar, A.I., Steele, R.D., Steinke, R.C.: Performance Analysis and Validation of a Stereo Vision System. In: IEEE International Conference on Systems, Man and Cybernetics, vol. 2, pp 1409–1416 (2005)

  91. Blender Online Community: Blender - a 3D modelling and rendering package. Blender Institute, Amsterdam (2016)

  92. Vedaldi, A., Fulkerson, B.: VLFeat: An Open and Portable Library of Computer Vision Algorithms. In: ACM International Conference on Multimedia (2010)

  93. Bouguet, J.-Y.: Camera calibration toolbox for matlab (2004)

  94. MacKay, M.D., Fenton, R.G., Benhabib, B.: Active Vision for Human Action Sensing. In: Technological Developments in Education and Automation, pp 397–402. Springer (2010)

Acknowledgements

The authors would like to acknowledge the support received, in part, by the Natural Sciences and Engineering Research Council of Canada (NSERC).

Author information

Correspondence to Evgeny Nuger.

Appendices

Appendix A

This appendix briefly outlines the model-generation and deformation-estimation methods. The nature of the problem requires the model-generation method to yield the most accurate and complete model of the target object without a priori knowledge of the object's identity. A generic approach must handle cases where the cameras are poorly positioned around the target object, in which triangulation methods recover only incomplete surface patches while a visual hull over-estimates the bounding volume. In contrast, a fusion technique that combines triangulation with a visual hull produces a model with highly accurate triangulated surface patches and a complete estimate of the object's volume.
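As a rough, generic illustration of this fusion idea (not the authors' exact algorithm), the sketch below carves a voxel-based visual hull from silhouette masks and then labels which hull voxels are already supported by triangulated surface-patch points; the voxel representation, projection conventions, and distance test are all assumptions.

```python
import numpy as np

def carve_visual_hull(voxels, projections, silhouettes):
    """Keep voxel centres (N,3) that project inside every camera's silhouette."""
    occupied = np.ones(len(voxels), dtype=bool)
    homog = np.c_[voxels, np.ones(len(voxels))]               # (N, 4) homogeneous points
    for P, mask in zip(projections, silhouettes):             # P: (3,4), mask: (H,W) bool
        uvw = homog @ P.T
        u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)
        v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)
        in_img = (u >= 0) & (u < mask.shape[1]) & (v >= 0) & (v < mask.shape[0])
        occupied &= in_img
        occupied[in_img] &= mask[v[in_img], u[in_img]]
    return occupied

def label_stereo_support(voxels, occupied, patch_points, tol):
    """Split hull voxels into those near a triangulated patch point and the rest."""
    d = np.linalg.norm(voxels[:, None, :] - patch_points[None, :, :], axis=2).min(axis=1)
    return occupied & (d < tol), occupied & (d >= tol)
```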

Model generation for a priori unknown objects implies that an accurate error measurement between the recovered shape and the true model is not possible at run-time from the system's perspective. The model's accuracy can be measured for test objects whose models are known to the user, but during the system's run-time a ground-truth model is unavailable. Therefore, the model-generation method must inherently attempt to maximize recovery accuracy and completeness through the implemented shape-recovery technique. The deformation-estimation method receives the recovered, solid object model and the current camera parameters from the model-generation method and applies an adaptive particle-filtering algorithm to estimate the deformation.
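A minimal sequential importance resampling (SIR) particle-filter step is sketched below to illustrate the predict, weight, and resample cycle; the motion and likelihood models are placeholders, and the adaptive aspects of the algorithm used in the paper (e.g., adjusting the particle spread) are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_step(particles, weights, observation,
                         motion_noise=0.01, obs_noise=0.05):
    """One SIR step; particles: (n, d) deformation states, observation: (d,)."""
    # Predict: propagate particles through a random-walk deformation model.
    particles = particles + rng.normal(0.0, motion_noise, particles.shape)
    # Weight: Gaussian likelihood of the observed state for each particle.
    err = np.linalg.norm(particles - observation, axis=1)
    weights = weights * np.exp(-0.5 * (err / obs_noise) ** 2)
    weights = weights / (weights.sum() + 1e-300)
    # Resample when the effective sample size drops below half the particle count.
    if 1.0 / np.sum(weights ** 2) < 0.5 * len(particles):
        idx = rng.choice(len(particles), size=len(particles), p=weights)
        particles = particles[idx]
        weights = np.full(len(particles), 1.0 / len(particles))
    estimate = weights @ particles     # weighted mean = expected deformation
    return particles, weights, estimate
```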

The model-generation method comprises three steps: surface-patch triangulation through stereo-camera pairs, visual-hull carving, and fusion. Surface-patch triangulation yields a set of surface patches that represent the stereo-visible regions of the target object, given the stereo cameras' positions relative to the object. For a multi-camera system, stereo triangulation is the most accurate approach to recovering surface information about a priori unknown objects.
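For example, a single matched feature can be triangulated from two calibrated views by finding the point closest to both back-projected rays (the midpoint method), which mirrors the role of c_M^* in the nomenclature; this generic sketch is illustrative and not necessarily the exact triangulation used in the paper.

```python
import numpy as np

def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the closest points between two rays x = c + t*d.

    c1, c2 : (3,) camera centres;  d1, d2 : (3,) ray directions.
    """
    d1 = d1 / np.linalg.norm(d1)
    d2 = d2 / np.linalg.norm(d2)
    b = c2 - c1
    a1, a2, a12 = d1 @ d1, d2 @ d2, d1 @ d2
    denom = a1 * a2 - a12 ** 2                 # approaches 0 for (near-)parallel rays
    t1 = (a2 * (d1 @ b) - a12 * (d2 @ b)) / denom
    t2 = (a12 * (d1 @ b) - a1 * (d2 @ b)) / denom
    return 0.5 * ((c1 + t1 * d1) + (c2 + t2 * d2))

# e.g., triangulate_midpoint(np.zeros(3), np.array([1., 0, 0]),
#                            np.array([0., 1, 0]), np.array([0., 0, 1]))
```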

The model-generation and deformation-estimation methods were validated through multiple simulated experiments and compared to an existing method in the literature, namely the fusion algorithm developed by Li et al. [65], whose reconstruction method was shown to recover both concave and convex geometries. To provide a comparative analysis between the model-generation method proposed herein and that of Li et al., multiple simulated experiments are presented below for the model generation and deformation estimation of several a priori unknown objects deforming in the workspace. The comparison also varies the number of cameras available in the system to illustrate a camera-saturated workspace approach, otherwise known as a brute-force solution to the camera-placement problem. It is noted that the brute-force approach of increasing the number of cameras in the system is both computationally expensive and results in a much more constricted workspace due to the presence of the extra cameras.

The comparison of the methods presented herein includes three unique object deformations and six camera configurations. The initial object model and final deformation are presented in Fig. 20 for Simulation A, Fig. 21 for Simulation B, and Fig. 22 for Simulation C. The camera configurations, numbered I to VI, are presented in Figs. 23, 24 and 25, respectively. The methods were compared based on the error between the generated model's total surface area and that of the ground-truth model, and on the error between their volumes. The comparative results are presented in Table 3. One major difference must be noted: the errors calculated for the method developed by Li et al. were for the model recovered at the current demand instant, while the errors for the chosen model-generation method were based on the estimated deformation for the next demand instant. The comparisons illustrate the accuracy of the chosen model-generation and deformation-estimation methods when compared to other methods available in the literature.
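The normalized error metrics referred to above can be computed from triangle meshes as sketched below; this is a generic implementation that assumes closed, consistently oriented meshes, and the paper's exact normalization may differ.

```python
import numpy as np

def mesh_area_and_volume(vertices, faces):
    """Total surface area and enclosed volume of a closed triangle mesh.

    vertices : (n, 3) float array;  faces : (m, 3) integer vertex indices.
    """
    v0, v1, v2 = vertices[faces[:, 0]], vertices[faces[:, 1]], vertices[faces[:, 2]]
    cross = np.cross(v1 - v0, v2 - v0)
    area = 0.5 * np.linalg.norm(cross, axis=1).sum()
    # Divergence theorem: sum of signed tetrahedron volumes w.r.t. the origin.
    volume = abs(np.einsum('ij,ij->i', v0, np.cross(v1, v2)).sum()) / 6.0
    return area, volume

def normalized_errors(recovered, ground_truth):
    """Each argument is a (vertices, faces) pair; returns the surface-area and
    volume errors, each normalized by the corresponding ground-truth value."""
    a_r, v_r = mesh_area_and_volume(*recovered)
    a_g, v_g = mesh_area_and_volume(*ground_truth)
    return abs(a_r - a_g) / a_g, abs(v_r - v_g) / v_g
```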

Fig. 20
Simulation A, top row – initial object deformation, bottom row – final object deformation [61]

Fig. 21
Simulation B, top row – initial object deformation, bottom row – final object deformation [61]

Fig. 22
Simulation C, top row – initial object deformation, bottom row – final object deformation [61]

Fig. 23
Camera configuration I (left), II (right) [61]

Fig. 24
Camera configuration III (left), IV (right) [61]

Fig. 25
Camera configuration V (left), VI (right) [61]

Table 3 Comparative results, normalized errors [61]

A final comparison is presented in Table 4, wherein the chosen model-generation and deformation-estimation method was compared to a variant in which the target object was a priori known and assumed to have a fixed surface area. Overall, the known-model variant showed lower errors, except in Simulation A, where the object's surface area changed.

Table 4 Comparative results with known model [61]

Appendix B

The figures illustrate the visibility results for 12 supplementary simulations. The simulations were run using the same deforming object as in Section 4. However, for these simulations, the camera models were based on the calibrated camera parameters used in the experiments, namely an 18 mm focal length, a 4:3 aspect-ratio sensor, and lens distortion. To set up these simulations, the camera and obstacle positions were scaled within the workspace. The results followed the same trend as the simulations in Section 4, but produced a slightly lower visibility metric with increased variance. This disparity can be attributed to the added lens distortion and to the foreshortening effect of the short-focal-length lens.
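For reference, a minimal sketch of the type of camera model these simulations imply: a pinhole projection followed by polynomial radial distortion of the kind estimated by common calibration toolboxes. The sensor dimensions, principal point, and distortion coefficients below are placeholders, not the calibrated values used in the experiments.

```python
import numpy as np

def project_with_distortion(X_cam, f_mm=18.0, sensor_mm=(4.8, 3.6),
                            image_px=(640, 480), k1=0.0, k2=0.0):
    """Project camera-frame points (N, 3) to pixel coordinates.

    f_mm: focal length; sensor_mm: placeholder 4:3 sensor size;
    k1, k2: radial distortion coefficients (placeholders).
    """
    x = X_cam[:, 0] / X_cam[:, 2]                   # normalized image coordinates
    y = X_cam[:, 1] / X_cam[:, 2]
    r2 = x ** 2 + y ** 2
    radial = 1.0 + k1 * r2 + k2 * r2 ** 2           # polynomial radial distortion
    xd, yd = x * radial, y * radial
    fx = f_mm * image_px[0] / sensor_mm[0]          # focal length in pixels
    fy = f_mm * image_px[1] / sensor_mm[1]
    u = fx * xd + image_px[0] / 2.0                 # principal point at image centre
    v = fy * yd + image_px[1] / 2.0
    return np.stack([u, v], axis=1)
```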

Appendix C

The following figures illustrate the (bird's-eye-view) deformation sequences and subsequent camera reconfigurations at select demand instants for the simulations described in Section 4 above. The estimated visible polygons of the object model are colored green to illustrate the recovered stereo-surface area of the target object.

Figures 28 and 29 illustrate every odd-numbered frame of the first simulation set for the static camera placement.

Fig. 26
Surface area results for Simulations 1–6, respectively

Fig. 27
Surface area results for Simulations 7–12, respectively

Fig. 28
Static camera Simulation Frames {1, 3, 5, 7, 9, 11}, respectively

Fig. 29
Static camera Simulation Frames {13, 15, 17, 19}, respectively

Figures 30 and 31 illustrate every odd-numbered frame of the first simulation set for the camera placement obtained through the proposed reconfiguration method.

Fig. 30
Reconfiguration through proposed method, Simulation Frames {1, 3, 5, 7, 9, 11}, respectively

Fig. 31
Reconfiguration through proposed method, Simulation Frames {13, 15, 17, 19}, respectively

Figures 32 and 33 illustrate every odd-numbered frame of the first simulation set for the ideal reconfiguration camera placement obtained through the proposed method.

Fig. 32
Ideal camera placement, Simulation Frames {1, 3, 5, 7, 9, 11}, respectively

Fig. 33
Ideal camera placement, Simulation Frames {13, 15, 17, 19}, respectively

Tables 5 and 6 list the global x and y positions of all six cameras in the first simulation, expressed relative to the world reference frame on which the target object was centered.

Table 5 Camera X,Y positions for reconfiguration Simulation 1
Table 6 Camera X,Y positions for ideal reconfiguration Simulation 1

Appendix D

The following figures illustrate the (bird's-eye-view) deformation sequences and subsequent camera reconfigurations at select demand instants for the experiments described in Section 5 above. The estimated visible polygons of the object model are colored green to illustrate the recovered stereo-surface area of the target object. The remainder of the recovered model, which was not estimated as visible, is colored red.

Figures 34, 35, 36, 37 and 38 illustrate every odd-numbered frame of Experiments 1–5, respectively, and compare the reconfiguration obtained through the proposed method to the performance of the static method.

Fig. 34
Experiment 1, odd Frames 1–19, comparison between reconfiguration and static cameras

Fig. 35
Experiment 2, odd Frames 1–19, comparison between reconfiguration and static cameras

Fig. 36
Experiment 3, odd Frames 1–19, comparison between reconfiguration and static cameras

Fig. 37
Experiment 4, odd Frames 1–19, comparison between reconfiguration and static cameras

Fig. 38
Experiment 5, odd Frames 1–19, comparison between reconfiguration and static cameras

Table 7 lists the global x and y positions of all six cameras in the fifth experiment, expressed relative to the world reference frame on which the target object was centered.

Table 7 Camera X,Y positions for Reconfiguration Experiment 5


Cite this article

Nuger, E., Benhabib, B. Multi-Camera Active-Vision for Markerless Shape Recovery of Unknown Deforming Objects. J Intell Robot Syst 92, 223–264 (2018). https://doi.org/10.1007/s10846-018-0773-0
