Multi-Camera Active-Vision for Markerless Shape Recovery of Unknown Deforming Objects

Abstract

A novel multi-camera active-vision reconfiguration method is proposed for the markerless shape recovery of unknown deforming objects. The proposed method implements a model-fusion technique to obtain a complete 3D mesh model via triangulation and a visual hull. The model is tracked using an adaptive particle-filtering algorithm, yielding a deformation estimate that can then be used to reconfigure the cameras for improved surface visibility. The objective of reconfiguration is the maximization of the total surface area visible through stereo triangulation. This surface-area-based objective function directly relates to maximizing the accuracy of the recovered shape, since stereo triangulation is more accurate than visual-hull building when the number of viewpoints is limited. The reconfiguration process comprises workspace discretization, visibility estimation, optimal stereo-pose selection, and path planning to ensure 2D tracked-feature consistency. In contrast to other reconfiguration techniques that rely on a priori known, and at times static, object models, our method addresses a priori unknown deforming objects. The proposed method operates on-line and, through extensive simulations and experiments, has been shown to outperform static-camera systems by increasing surface visibility in the presence of occluding obstacles.


Abbreviations

C_M : A matrix of the object model's center point repeated k times [k × 3].
K : An indexing matrix of all positional combinations [p_max × c_k].
K^* : A filtered indexing matrix [p_red × c_k].
L(t) : The system and workspace constraints at demand instant t.
M : The object model at demand instant t.
M^+ : The expected object-model deformation at the next demand instant.
O^+ : The expected obstacle model at the next demand instant.
P_c(t) : The set of camera parameters at demand instant t.
S(t) : The surface area of the object at demand instant t.
R(k) : The estimated surface-area visibility score for the kth positional combination.
V : A matrix of test positional vectors for reconfiguration [n_v × 3].
V(t + 1) : The expected model visibility at the next demand instant.
V_map : The normalized visibility-proportion mapping matrix [n_p × n_v].
V_map^* : A subset of V_map [n_p × c_n].
X_test : A matrix of test points from model M^+ [k × 3].
X_test^* : The set of test points projected onto an arbitrary plane [k × 3].
Y : A Boolean matrix of test-point visibility [k × 4].
a_j : The area of the jth polygon in the model at the current demand instant.
c_k : The number of stereo camera pairs at the current demand instant.
c_k^* : The number of filtered stereo camera pairs at the current demand instant.
c_M : The center point of the object model [1 × 3].
c_M^+ : The projected center point of the object model [1 × 3].
c_M^* : A point in space produced by the triangulation of two rays associated with the object's center point and stereo-pair placement [1 × 3].
c_n : The number of stereo camera pairs at the current demand instant.
c_obs : The center point of the obstacle [1 × 3].
d^d : A vector of distances along the test positional vector from the model center to all test points.
d_thresh^d : The maximum allowable distance for d^d.
d^w : A vector of distances between the model center and all test points on a plane.
d_thresh^w : The maximum allowable distance for d^w.
g : The distance, in pixels, from the object's bounding rectangle to the image edge.
h : The stereo-camera-pair placement score.
k : An arbitrary row of the K matrix [1 × c_k].
n_p : The number of model polygons at the current demand instant.
n_v : The number of test positional vectors.
p : A point on a test positional vector [1 × 3].
p_c : The mean stereo-camera-pair position at the current demand instant [1 × 3].
p_c^+ : The mean stereo-camera-pair position at the next demand instant [1 × 3].
p_b^1, p_b^2 : The Bézier-curve control points [1 × 3].
p_max : The total number of possible positional combinations.
p_red : The number of positional combinations used.
q : The unit normal vector of an arbitrary test point.
v_depth : A vector of Boolean visibility values for each test point [k × 1].
v_HPR : A vector of Boolean visibility values for each test point [k × 1].
v_norm : A vector of Boolean visibility values for each test point [k × 1].
v_test : A unit test vector from V [1 × 3].
v_width : A vector of Boolean visibility values for each test point [k × 1].
y : A vector of the logical conjunction of Y across rows [k × 1].
β : The maximum path-motion angle.
θ_min : The minimum angular separation between two sets of stereo camera pairs.
σ_thresh^d : The standard-deviation threshold value for the depth operator.
σ_thresh^w : The standard-deviation threshold value for the width operator.
ϕ : The angular separation between the test positional vector and a test-point normal.
ϕ_max : The maximum angle between a test positional vector and a point normal.
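For illustration only, the sketch below (Python/NumPy) shows one plausible way the quantities defined above could combine into the surface-area visibility score R(k): each positional combination indexed by a row of K is scored using the polygon areas a_j of the predicted model M^+, weighted by the visibility proportions stored in V_map. The aggregation rule is an assumption, not the paper's exact scoring.

```python
import numpy as np

def visibility_scores(V_map, K, areas):
    """Estimated visible surface area R(k) for each positional combination.

    V_map : (n_p, n_v) normalized visibility proportions of each model polygon
            from each test positional vector.
    K     : (p_red, c_k) integer index matrix; row k lists the test-position
            indices assigned to the stereo pairs in combination k.
    areas : (n_p,) polygon areas a_j of the predicted model M+.
    """
    R = np.empty(len(K))
    for k, combo in enumerate(K):
        # Assume a polygon counts via the best visibility proportion it achieves
        # among the stereo poses selected in this combination.
        best = V_map[:, combo].max(axis=1)
        R[k] = areas @ best            # area-weighted visible proportion
    return R

# e.g., reconfigure towards K[np.argmax(visibility_scores(V_map, K, areas))]
```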

References

  1. Song, B., Ding, C., Kamal, A., Farrell, J.A., Roy-Chowdhury, A.K.: Distributed camera networks. IEEE Signal Process. Mag. 28(3), 20–31 (2011)

  2. Piciarelli, C., Esterle, L., Khan, A., Rinner, B., Foresti, G.L.: Dynamic reconfiguration in camera networks: a short survey. IEEE Trans. Circuits Syst. Video Technol. 26(5), 965–977 (2016)

  3. Ilie, A., Welch, G., Macenko, M.: A Stochastic Quality Metric for Optimal Control of Active Camera Network Configurations for 3D Computer Vision Tasks. In: ECCV Workshop on Multicamera and Multimodal Sensor Fusion Algorithms and Applications, pp 1–12 (2008)

  4. Cowan, C.K.: Model-Based Synthesis of Sensor Location. In: IEEE International Conference on Robotics and Automation (ICRA), pp 900–905 (1988)

  5. Tarabanis, K.A., Tsai, R.Y., Abrams, S.: Planning Viewpoints that Simultaneously Satisfy Several Feature Detectability Constraints for Robotic Vision. In: International Conference on Advanced Robotics “Robots in Unstructured Environments”, vol. 2, pp 1410–1415 (1991)

  6. Qureshi, F.Z., Terzopoulos, D.: Surveillance in Virtual Reality: System Design and Multi-Camera Control. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 1–8 (2007)

  7. Piciarelli, C., Micheloni, C., Foresti, G.L.: PTZ Camera Network Reconfiguration. In: International Conference on Distributed Smart Cameras (ICDSC), pp 1–7 (2009)

  8. Collins, R.T., Amidi, O., Kanade, T.: An Active Camera System for Acquiring Multi-View Video. In: International Conference on Image Processing, pp 1–4 (2002)

  9. Chen, S.Y., Li, Y.F.: Vision sensor planning for 3-d model acquisition. IEEE Trans. Syst. Man, Cybern. Part B Cybern. 35(5), 894–904 (2005)

  10. Schramm, F., Geffard, F., Morel, G., Micaelli, A.: Calibration Free Image Point Path Planning Simultaneously Ensuring Visibility and Controlling Camera Path. In: IEEE International Conference on Robotics and Automation (ICRA), pp 2074–2079 (2007)

  11. Amamra, A., Amara, Y., Benaissa, R., Merabti, B.: Optimal Camera Path Planning for 3D Visualisation. In: SAI Computing Conference, pp 388–393 (2016)

  12. Mir-Nasiri, N.: Camera-Based 3D Object Tracking and Following Mobile Robot. In: IEEE Conference on Robotics, Automation and Mechatronics, pp 1–6 (2006)

  13. Abrams, S., Allen, P.K., Tarabanis, K.A.: Dynamic Sensor Planning. In: IEEE International Conference on Robotics and Automation (ICRA), vol. 2, pp 605–610 (1993)

  14. Tarabanis, K.A., Tsai, R.Y., Allen, P.K.: The MVP sensor planning system for robotic vision tasks. IEEE Trans. Robot. Autom. 11(1), 72–85 (1995)

  15. Christie, M., Machap, R., Normand, J.-M., Olivier, P., Pickering, J.: Virtual Camera Planning: a Survey. In: International Symposium on Smart Graphics, pp 40–52 (2005)

  16. Bakhtari, A., Naish, M.D., Eskandari, M., Croft, E.A., Benhabib, B.: Active-vision-based multisensor surveillance-an implementation. IEEE Trans. Syst. Man, Cybern. Part C Appl. Rev. 36(5), 668–680 (2006)

  17. MacKay, M.D., Fenton, R.G., Benhabib, B.: Pipeline-architecture based real-time active-vision for human-action recognition. J. Intell. Robot. Syst. 72(3–4), 385–407 (2013)

  18. Schacter, D.S., Donnici, M., Nuger, E., MacKay, M.D., Benhabib, B.: A multi-camera active-vision system for deformable-object-motion capture. J. Intell. Robot. Syst. 75(3), 413–441 (2014)

  19. Herrera, J.L. A., Chen, X.: Consensus Algorithms in a Multi-Agent Framework to Solve PTZ Camera Reconfiguration in UAVs. In: International Conference on Intelligent Robotics and Applications, pp 331–340 (2012)

  20. Konda, K.R., Conci, N.: Real-Time Reconfiguration of PTZ Camera Networks Using Motion Field Entropy and Visual Coverage. In: Proceedings of the International Conference on Distributed Smart Cameras, p 18 (2014)

  21. Natarajan, P., Hoang, T.N., Low, K.H., Kankanhalli, M.: Decision-Theoretic Coordination and Control for Active Multi-Camera Surveillance in Uncertain, Partially Observable Environments. In: International Conference on Distributed Smart Cameras (ICDSC), pp 1–6 (2012)

  22. Song, B., Soto, C., Roy-Chowdhury, A.K., Farrell, J.A.: Decentralized Camera Network Control Using Game Theory. In: 2008 Second ACM/IEEE International Conference on Distributed Smart Cameras, pp 1–8 (2008)

  23. Ding, C., Song, B., Morye, A., Farrell, J.A., Roy-Chowdhury, A.K.: Collaborative sensing in a distributed PTZ camera network. IEEE Trans. Image Process. 21(7), 3282–3295 (2012)

  24. Del Bimbo, A., Dini, F., Lisanti, G., Pernici, F.: Exploiting distinctive visual landmark maps in pan-tilt-zoom camera networks. Comput. Vis. Image Underst. 114(6), 611–623 (2010)

  25. Piciarelli, C., Micheloni, C., Foresti, G.L.: Occlusion-Aware Multiple Camera Reconfiguration. In: International Conference on Distributed Smart Cameras (ICDSC), p 88 (2010)

  26. Indu, S., Chaudhury, S., Mittal, N.R., Bhattacharyya, A.: Optimal Sensor Placement for Surveillance of Large Spaces. In: International Conference on Distributed Smart Cameras (ICDSC), pp 1–8 (2009)

  27. Schwager, M., Julian, B.J., Angermann, M., Rus, D.: Eyes in the sky: decentralized control for the deployment of robotic camera networks. Proc. IEEE 99(9), 1541–1561 (2011)

  28. Konda, K.R., Rosani, A., Conci, N., De Natale, F.G.B.: Smart Camera Reconfiguration in Assisted Home Environments for Elderly Care. In: European Conference on Computer Vision (ECCV), pp 45–58 (2014)

  29. Tarabanis, K.A., Allen, P.K., Tsai, R.Y.: A survey of sensor planning in computer vision. IEEE Trans. Robot. Autom. 11(1), 86–104 (1995)

  30. Pito, R.: A solution to the next best view problem for automated surface acquisition. IEEE Trans. Pattern Anal. Mach. Intell. 21(10), 1016–1030 (1999)

  31. Wong, L.M., Dumont, C., Abidi, M.A.: Next Best View System in a 3D Object Modeling Task. In: IEEE International Symposium on Computational Intelligence in Robotics and Automation (CIRA), pp 306–311 (1999)

  32. Bircher, A., Kamel, M., Alexis, K., Oleynikova, H., Siegwart, R.: Receding Horizon ‘Next-Best-View’ Planner for 3D Exploration. In: 2016 IEEE International Conference on Robotics and Automation (ICRA), pp 1462–1468 (2016)

  33. Liska, C., Sablatnig, R.: Adaptive 3D Acquisition Using Laser Light. In: Czech Pattern Recognition Workshop, pp 111–117 (2000)

  34. Chan, M.-Y., Mak, W.-H., Qu, H.: An Efficient Quality-Based Camera Path Planning Method for Volume Exploration. In: International Symposium on Visual Computing, pp 12–21 (2008)

  35. Benhamou, F., Goualard, F., Languénou, É., Christie, M.: Interval constraint solving for camera control and motion planning. ACM Trans. Comput. Log. 5(4), 732–767 (2004)

  36. Assa, J., Wolf, L., Cohen-Or, D.: The virtual director: a correlation-based online viewing of human motion. Comput. Graph. Forum 29(2), 595–604 (2010)

  37. Naish, M.D., Croft, E.A., Benhabib, B.: Simulation-Based Sensing-System Configuration for Dynamic Dispatching. In: IEEE International Conference on Systems, Man and Cybernetics, vol. 5, pp 2964–2969 (2001)

  38. Naish, M.D., Croft, E.A., Benhabib, B.: Coordinated dispatching of proximity sensors for the surveillance of manoeuvring targets. Robot. Comput. Integr. Manuf. 19(3), 283–299 (2003)

  39. Bakhtari, A., Benhabib, B.: An active vision system for multitarget surveillance in dynamic environments. IEEE Trans. Syst. Man, Cybern. Part B Cybern. 37(1), 190–198 (2007)

  40. Bakhtari, A., MacKay, M.D., Benhabib, B.: Active-vision for the autonomous surveillance of dynamic, multi-object environments. J. Intell. Robot. Syst. 54(4), 567–593 (2009)

  41. Tan, J.K., Ishikawa, S., Yamaguchi, I., Naito, T., Yokota, M.: 3-D recovery of human motion by mobile stereo cameras. Artif. Life Robot. 10(1), 64–68 (2006)

  42. Malik, R., Bajcsy, P.: Automated Placement of Multiple Stereo Cameras. In: ECCV Workshop on Omnidirectional Vision, Camera Networks and Non-Classical Cameras (2008)

  43. Hasler, N., Rosenhahn, B., Thormählen, T., Wand, M., Gall, J., Seidel, H.P.: Markerless Motion Capture with Unsynchronized Moving Cameras. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 224–231 (2009)

  44. MacKay, M.D., Fenton, R.G., Benhabib, B.: Time-varying-geometry object surveillance using a multi-camera active-vision system. Int. J. Smart Sens. Intell. Syst. 1(3), 679–704 (2008)

  45. MacKay, M.D., Fenton, R.G., Benhabib, B.: Multi-camera active surveillance of an articulated human form - an implementation strategy. Comput. Vis. Image Underst. 115(10), 1395–1413 (2011)

  46. Hofmann, M., Gavrila, D.M.: Multi-view 3D human pose estimation in complex environment. Int. J. Comput. Vis. 96(1), 103–124 (2012)

  47. Schacter, D.S.: Multi-camera active-vision system reconfiguration for deformable-object motion capture. University of Toronto (2014)

  48. Zhao, W., Gao, S., Lin, H.: A robust hole-filling algorithm for triangular mesh. Vis. Comput. 23(12), 987–997 (2007)

  49. Hilton, A., Stoddart, A.J., Illingworth, J., Windeatt, T.: Reliable Surface Reconstruction from Multiple Range Images. In: European Conference on Computer Vision (ECCV), pp 117–126 (1996)

  50. Davis, J., Marschner, S.R., Garr, M., Levoy, M.: Filling Holes in Complex Surfaces Using Volumetric Diffusion. In: Proceedings. First International Symposium on 3D Data Processing Visualization and Transmission, pp 428–861 (2002)

  51. Kalal, Z., Matas, J., Mikolajczyk, K.: Online Learning of Robust Object Detectors during Unstable Tracking. In: IEEE International Conference on Computer Vision Workshops (ICCV Workshops), pp 1417–1424 (2009)

  52. Forsyth, D.A., Ponce, J.: Computer Vision: A Modern Approach, 2nd edn. Pearson, London (2012)

  53. Hughes, J.F. et al.: Computer Graphics: Principles and Practice, 3rd edn. Addison-Wesley Professional, Boston (2013)

  54. Laurentini, A.: Visual hull concept for silhouette-based image understanding. IEEE Trans. Pattern Anal. Mach. Intell. 16(2), 150–162 (1994)

  55. Terauchi, T., Oue, Y., Fujimura, K.: A Flexible 3D Modeling System Based on Combining Shape-From-Silhouette with Light-Sectioning Algorithm. In: International Conference on 3-D Digital Imaging and Modeling, pp 196–203 (2005)

  56. Hernández Esteban, C., Schmitt, F.: Silhouette and stereo fusion for 3d object modeling. Comput. Vis. Image Underst. 96(3), 367–392 (2004)

  57. Cremers, D., Kolev, K.: Multiview stereo and silhouette consistency via convex functionals over convex domains. IEEE Trans. Pattern Anal. Mach. Intell. 33(6), 1161–1174 (2011)

  58. Liu, Y., Dai, Q., Xu, W.: A point-cloud-based multiview stereo algorithm for free-viewpoint video. IEEE Trans. Vis. Comput. Graph. 16(3), 407–418 (2010)

  59. Hebert, P. et al.: Combined Shape, Appearance and Silhouette for Simultaneous Manipulator and Object Tracking. In: International Conference on Robotics and Automation, pp 2405–2412 (2012)

  60. Song, P., Wu, X., Wang, M.Y.: Volumetric stereo and silhouette fusion for image-based modeling. Vis. Comput. 26(12), 1435–1450 (2010)

  61. Nuger, E., Benhabib, B.: Multicamera fusion for shape estimation and visibility analysis of unknown deforming objects. J. Electron. Imaging 25(4), 41009 (2016)

  62. Huang, C.-H., Cagniart, C., Boyer, E., Ilic, S.: A Bayesian approach to multi-view 4D modeling. Int. J. Comput. Vis. 116(2), 115–135 (2016)

  63. Matusik, W., Buehler, C., Raskar, R., Gortler, S.J., McMillan, L.: Image-Based Visual Hulls. In: Proceedings of the Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH, pp 369–374 (2000)

  64. Corazza, S., Mündermann, L., Gambaretto, E., Ferrigno, G., Andriacchi, T.P.: Markerless motion capture through visual hull, articulated ICP and subject specific model generation. Int. J. Comput. Vis. 87(1–2), 156–169 (2010)

  65. Li, Q., Xu, S., Xia, D., Li, D.: A Novel 3D Convex Surface Reconstruction Method Based on Visual Hull. In: Pattern Recognition and Computer Vision, vol. 8004, p 800412 (2011)

  66. Roshnara Nasrin, P.P., Jabbar, S.: Efficient 3D Visual Hull Reconstruction Based on Marching Cube Algorithm. In: International Conference on Innovations in Information, Embedded and Communication Systems (ICIIECS), pp 1–6 (2015)

  67. Mercier, B., Meneveaux, D., Fournier, A.: A framework for automatically recovering object shape, reflectance and light sources from calibrated images. Int. J. Comput. Vis. 73(1), 77–93 (2007)

  68. Lorensen, W.E., Cline, H.E.: Marching Cubes: a High Resolution 3D Surface Construction Algorithm. In: Proceedings of the 14Th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH), vol. 21, no. 4, pp 163–169 (1987)

  69. Lazebnik, S., Furukawa, Y., Ponce, J.: Projective visual hulls. Int. J. Comput. Vis. 74(2), 137–165 (2007)

  70. Tomasi, C., Kanade, T.: Shape and motion from image streams: a factorization method. Proc. Natl. Acad. Sci. 90(21), 9795–9802 (1993)

  71. Pollefeys, M., Vergauwen, M., Cornelis, K., Tops, J., Verbiest, F., Van Gool, L.: Structure and Motion from Image Sequences. In: Proceedings of the Conference on Optical 3D Measurement Techniques, pp 251–258 (2001)

  72. Lhuillier, M., Quan, L.: A quasi-dense approach to surface reconstruction from uncalibrated images. IEEE Trans. Pattern Anal. Mach. Intell. 27(3), 418–433 (2005)

  73. Snavely, N., Seitz, S.M., Szeliski, R.: Photo tourism: exploring photo collections in 3d. ACM Trans. Graph. 25(3), 835–846 (2006)

  74. Del Bue, A., Agapito, L.: Non-rigid stereo factorization. Int. J. Comput. Vis. 66(2), 193–207 (2006)

  75. Huang, Y., Tu, J., Huang, T.S.: A Factorization Method in Stereo Motion for Non-Rigid Objects. In: IEEE International Conference on Acoustics, Speech and Signal Processing, No. 1, pp 1065–1068 (2008)

  76. Bay, H., Ess, A., Tuytelaars, T., Van Gool, L.: Speeded up robust features (SURF). Comput. Vis. Image Underst. 110(3), 346–359 (2008)

  77. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60(2), 91–110 (2004)

  78. Furukawa, Y., Ponce, J.: Accurate, dense, and robust multiview stereopsis. IEEE Trans. Pattern Anal. Mach. Intell. 32(8), 1362–1376 (2010)

  79. Kalman, R.E.: A new approach to linear filtering and prediction problems. ASME Trans. J. Basic Eng. 82(Series D), 35–45 (1960)

  80. Welch, G., Bishop, G.: An Introduction to the Kalman Filter. Technical Report TR 95-041, University of North Carolina at Chapel Hill (2006)

  81. Ristic, B., Arulampalam, S., Gordon, N.: A Tutorial on Particle Filters. In: Beyond the Kalman Filter: Particle Filter for Tracking Applications, pp 35–62. Artech House, Boston (2004)

  82. Sui, Y., Zhang, L.: Robust tracking via locally structured representation. Int. J. Comput. Vis. 119(2), 110–144 (2016)

  83. Gonzales, C., Dubuisson, S.: Combinatorial resampling particle filter: an effective and efficient method for articulated object tracking. Int. J. Comput. Vis. 112(3), 255–284 (2015)

  84. Kwolek, B., Krzeszowski, T., Gagalowicz, A., Wojciechowski, K., Josinski, H.: Real-Time Multi-View Human Motion Tracking Using Particle Swarm Optimization with Resampling. In: International Conference on Articulated Motion and Deformable Objects (AMDO), pp 92–101 (2012)

  85. Zhang, X., Hu, W., Xie, N., Bao, H., Maybank, S.: A robust tracking system for low frame rate video. Int. J. Comput. Vis. 115(3), 279–304 (2015)

  86. Maung, T.H.H.: Real-time hand tracking and gesture recognition system using neural networks. World Acad. Sci. Eng. Technol. 50, 466–470 (2009)

  87. Agarwal, A., Datla, S., Tyagi, B., Niyogi, R.: Novel design for real time path tracking with computer vision using neural networks. Int. J. Comput. Vis. Robot. 1(4), 380–391 (2010)

  88. Katz, S., Tal, A., Basri, R.: Direct visibility of point sets. ACM Trans. Graph. 26(3), 1–12 (2007)

  89. Möller, T., Trumbore, B.: Fast, minimum storage ray-triangle intersection. J. Graph. Tools 2(1), 21–28 (1997)

  90. Kim, W.S., Ansar, A.I., Steele, R.D., Steinke, R.C.: Performance Analysis and Validation of a Stereo Vision System. In: IEEE International Conference on Systems, Man and Cybernetics, vol. 2, pp 1409–1416 (2005)

  91. Blender Online Community: Blender - a 3D modelling and rendering package. Blender Institute, Amsterdam (2016)

  92. Vedaldi, A., Fulkerson, B.: VLFeat: An Open and Portable Library of Computer Vision Algorithms. In: ACM International Conference on Multimedia (2010)

  93. Bouguet, J.-Y.: Camera calibration toolbox for matlab (2004)

  94. MacKay, M.D., Fenton, R.G., Benhabib, B.: Active Vision for Human Action Sensing. In: Technological Developments in Education and Automation, pp 397–402. Springer (2010)

Acknowledgements

The authors would like to acknowledge the support received, in part, by the Natural Sciences and Engineering Research Council of Canada (NSERC).

Author information

Correspondence to Evgeny Nuger.

Appendices

Appendix A

This appendix briefly outlines the model-generation and deformation-estimation methods. The nature of the problem requires the model-generation method to yield the most accurate and complete model of the target object without a priori knowledge of the object's identity. A generic approach must handle cases where the cameras are poorly positioned around the target object, in which triangulation methods recover only incomplete surface patches while a visual hull over-estimates the bounding volume. In contrast, a fusion technique that combines triangulation with a visual hull produces a model with highly accurate triangulated surface patches and a complete estimate of the object's volume.
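As a rough, generic illustration of this fusion idea (not the authors' exact algorithm), the sketch below carves a voxel-based visual hull from silhouette masks and then labels which hull voxels are already supported by triangulated surface-patch points; the voxel representation, projection conventions, and distance test are all assumptions.

```python
import numpy as np

def carve_visual_hull(voxels, projections, silhouettes):
    """Keep voxel centres (N,3) that project inside every camera's silhouette."""
    occupied = np.ones(len(voxels), dtype=bool)
    homog = np.c_[voxels, np.ones(len(voxels))]               # (N, 4) homogeneous points
    for P, mask in zip(projections, silhouettes):             # P: (3,4), mask: (H,W) bool
        uvw = homog @ P.T
        u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)
        v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)
        in_img = (u >= 0) & (u < mask.shape[1]) & (v >= 0) & (v < mask.shape[0])
        occupied &= in_img
        occupied[in_img] &= mask[v[in_img], u[in_img]]
    return occupied

def label_stereo_support(voxels, occupied, patch_points, tol):
    """Split hull voxels into those near a triangulated patch point and the rest."""
    d = np.linalg.norm(voxels[:, None, :] - patch_points[None, :, :], axis=2).min(axis=1)
    return occupied & (d < tol), occupied & (d >= tol)
```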

Model generation for a priori unknown objects implies that an accurate error measurement between the recovered shape and the true model is not possible at run-time from the system's perspective. The model's accuracy can be measured for test objects whose models are known to the user, but during the system's run-time a ground-truth model is unavailable. Therefore, the model-generation method must inherently attempt to maximize recovery accuracy and completeness through the implemented shape-recovery technique. The deformation-estimation method receives the recovered, solid object model and the current camera parameters from the model-generation method and applies an adaptive particle-filtering algorithm to estimate the deformation.
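A minimal sequential importance resampling (SIR) particle-filter step is sketched below to illustrate the predict, weight, and resample cycle; the motion and likelihood models are placeholders, and the adaptive aspects of the algorithm used in the paper (e.g., adjusting the particle spread) are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_step(particles, weights, observation,
                         motion_noise=0.01, obs_noise=0.05):
    """One SIR step; particles: (n, d) deformation states, observation: (d,)."""
    # Predict: propagate particles through a random-walk deformation model.
    particles = particles + rng.normal(0.0, motion_noise, particles.shape)
    # Weight: Gaussian likelihood of the observed state for each particle.
    err = np.linalg.norm(particles - observation, axis=1)
    weights = weights * np.exp(-0.5 * (err / obs_noise) ** 2)
    weights = weights / (weights.sum() + 1e-300)
    # Resample when the effective sample size drops below half the particle count.
    if 1.0 / np.sum(weights ** 2) < 0.5 * len(particles):
        idx = rng.choice(len(particles), size=len(particles), p=weights)
        particles = particles[idx]
        weights = np.full(len(particles), 1.0 / len(particles))
    estimate = weights @ particles     # weighted mean = expected deformation
    return particles, weights, estimate
```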

The model-generation method comprises three steps: surface-patch triangulation through stereo-camera pairs, visual-hull carving, and fusion. Surface-patch triangulation yields a set of surface patches that represent the stereo-visible regions of the target object, given the stereo cameras' positions relative to the object. For a multi-camera system, stereo triangulation is the most accurate approach to recovering surface information about a priori unknown objects.
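For example, a single matched feature can be triangulated from two calibrated views by finding the point closest to both back-projected rays (the midpoint method), which mirrors the role of c_M^* in the nomenclature; this generic sketch is illustrative and not necessarily the exact triangulation used in the paper.

```python
import numpy as np

def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the closest points between two rays x = c + t*d.

    c1, c2 : (3,) camera centres;  d1, d2 : (3,) ray directions.
    """
    d1 = d1 / np.linalg.norm(d1)
    d2 = d2 / np.linalg.norm(d2)
    b = c2 - c1
    a1, a2, a12 = d1 @ d1, d2 @ d2, d1 @ d2
    denom = a1 * a2 - a12 ** 2                 # approaches 0 for (near-)parallel rays
    t1 = (a2 * (d1 @ b) - a12 * (d2 @ b)) / denom
    t2 = (a12 * (d1 @ b) - a1 * (d2 @ b)) / denom
    return 0.5 * ((c1 + t1 * d1) + (c2 + t2 * d2))

# e.g., triangulate_midpoint(np.zeros(3), np.array([1., 0, 0]),
#                            np.array([0., 1, 0]), np.array([0., 0, 1]))
```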

The model-generation and deformation-estimation methods were validated through multiple simulated experiments and compared to an existing method in the literature, namely the fusion algorithm developed by Li et al. [65], whose reconstruction method was shown to recover both concave and convex geometries. To provide a comparative analysis between the model-generation method proposed herein and that of Li et al., multiple simulated experiments are presented below for the model generation and deformation estimation of several a priori unknown objects deforming in the workspace. The comparison also varies the number of cameras available in the system to illustrate a camera-saturated workspace approach, otherwise known as a brute-force solution to the camera-placement problem. It is noted that the brute-force approach of increasing the number of cameras in the system is both computationally expensive and results in a much more constricted workspace due to the presence of the extra cameras.

The comparison of the methods presented herein includes three unique object deformations and six camera configurations. The initial object model and final deformation are presented in Fig. 20 for Simulation A, Fig. 21 for Simulation B, and Fig. 22 for Simulation C. The camera configurations, numbered I to VI, are presented in Figs. 23, 24 and 25, respectively. The methods were compared based on the error between the generated model's total surface area and that of the ground-truth model, and on the error between their volumes. The comparative results are presented in Table 3. One major difference must be noted: the errors calculated for the method developed by Li et al. were for the model recovered at the current demand instant, while the errors for the chosen model-generation method were based on the estimated deformation for the next demand instant. The comparisons illustrate the accuracy of the chosen model-generation and deformation-estimation methods when compared to other methods available in the literature.
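The normalized error metrics referred to above can be computed from triangle meshes as sketched below; this is a generic implementation that assumes closed, consistently oriented meshes, and the paper's exact normalization may differ.

```python
import numpy as np

def mesh_area_and_volume(vertices, faces):
    """Total surface area and enclosed volume of a closed triangle mesh.

    vertices : (n, 3) float array;  faces : (m, 3) integer vertex indices.
    """
    v0, v1, v2 = vertices[faces[:, 0]], vertices[faces[:, 1]], vertices[faces[:, 2]]
    cross = np.cross(v1 - v0, v2 - v0)
    area = 0.5 * np.linalg.norm(cross, axis=1).sum()
    # Divergence theorem: sum of signed tetrahedron volumes w.r.t. the origin.
    volume = abs(np.einsum('ij,ij->i', v0, np.cross(v1, v2)).sum()) / 6.0
    return area, volume

def normalized_errors(recovered, ground_truth):
    """Each argument is a (vertices, faces) pair; returns the surface-area and
    volume errors, each normalized by the corresponding ground-truth value."""
    a_r, v_r = mesh_area_and_volume(*recovered)
    a_g, v_g = mesh_area_and_volume(*ground_truth)
    return abs(a_r - a_g) / a_g, abs(v_r - v_g) / v_g
```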

Fig. 20
Simulation A, top row – initial object deformation, bottom row – final object deformation [61]

Fig. 21
Simulation B, top row – initial object deformation, bottom row – final object deformation [61]

Fig. 22
Simulation C, top row – initial object deformation, bottom row – final object deformation [61]

Fig. 23
Camera configuration I (left), II (right) [61]

Fig. 24
Camera configuration III (left), IV (right) [61]

Fig. 25
Camera configuration V (left), VI (right) [61]

Table 3 Comparative results, normalized errors [61]

A final comparison is presented in Table 4, wherein the chosen model-generation and deformation-estimation method was compared to a variant in which the target object was a priori known and assumed to have a fixed surface area. Overall, the known-model variant showed lower errors, except in Simulation A, where the object's surface area changed.

Table 4 Comparative results with known model [61]

Appendix B

The figures illustrate the visibility results for 12 supplementary simulations. The simulations were run using the same deforming object as in Section 4. However, for these simulations, the camera models were based on the calibrated camera parameters used in the experiments, namely an 18 mm focal length, a 4:3 aspect-ratio sensor, and lens distortion. To set up these simulations, the camera and obstacle positions were scaled within the workspace. The results followed the same trend as the simulations in Section 4, but produced a slightly lower visibility metric with increased variance. This disparity can be attributed to the added lens distortion and to the foreshortening effect of the short-focal-length lens.
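For reference, a minimal sketch of the type of camera model these simulations imply: a pinhole projection followed by polynomial radial distortion of the kind estimated by common calibration toolboxes. The sensor dimensions, principal point, and distortion coefficients below are placeholders, not the calibrated values used in the experiments.

```python
import numpy as np

def project_with_distortion(X_cam, f_mm=18.0, sensor_mm=(4.8, 3.6),
                            image_px=(640, 480), k1=0.0, k2=0.0):
    """Project camera-frame points (N, 3) to pixel coordinates.

    f_mm: focal length; sensor_mm: placeholder 4:3 sensor size;
    k1, k2: radial distortion coefficients (placeholders).
    """
    x = X_cam[:, 0] / X_cam[:, 2]                   # normalized image coordinates
    y = X_cam[:, 1] / X_cam[:, 2]
    r2 = x ** 2 + y ** 2
    radial = 1.0 + k1 * r2 + k2 * r2 ** 2           # polynomial radial distortion
    xd, yd = x * radial, y * radial
    fx = f_mm * image_px[0] / sensor_mm[0]          # focal length in pixels
    fy = f_mm * image_px[1] / sensor_mm[1]
    u = fx * xd + image_px[0] / 2.0                 # principal point at image centre
    v = fy * yd + image_px[1] / 2.0
    return np.stack([u, v], axis=1)
```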

Appendix C

The following figures illustrate the (bird's-eye-view) deformation sequences and subsequent camera reconfigurations at select demand instants for the simulations described in Section 4 above. The estimated visible polygons of the object model are colored green to illustrate the recovered stereo-surface area of the target object.

Figures 28 and 29 illustrate every odd-numbered frame of the first simulation set for the static camera placement.

Fig. 26
Surface area results for Simulations 1–6, respectively

Fig. 27
Surface area results for Simulations 7–12, respectively

Fig. 28
Static camera Simulation Frames {1, 3, 5, 7, 9, 11}, respectively

Fig. 29
Static camera Simulation Frames {13, 15, 17, 19}, respectively

Figures 30 and 31 illustrate every odd-numbered frame of the first simulation set for the camera placement obtained through the proposed reconfiguration method.

Fig. 30
Reconfiguration through proposed method, Simulation Frames {1, 3, 5, 7, 9, 11}, respectively

Fig. 31
Reconfiguration through proposed method, Simulation Frames {13, 15, 17, 19}, respectively

Figures 32 and 33 illustrate every odd-numbered frame of the first simulation set for the ideal reconfiguration camera placement obtained through the proposed method.

Fig. 32
Ideal camera placement, Simulation Frames {1, 3, 5, 7, 9, 11}, respectively

Fig. 33
Ideal camera placement, Simulation Frames {13, 15, 17, 19}, respectively

Tables 5 and 6 list the global x and y positions of all six cameras in the first simulation, expressed relative to the world reference frame on which the target object was centered.

Table 5 Camera X,Y positions for reconfiguration Simulation 1
Table 6 Camera X,Y positions for ideal reconfiguration Simulation 1

Appendix D

The following figures illustrate the (bird's-eye-view) deformation sequences and subsequent camera reconfigurations at select demand instants for the experiments described in Section 5 above. The estimated visible polygons of the object model are colored green to illustrate the recovered stereo-surface area of the target object. The remainder of the recovered model, which was not estimated as visible, is colored red.

Figures 34, 35, 36, 37 and 38 illustrate every odd-numbered frame of Experiments 1–5, respectively, and compare the reconfiguration obtained through the proposed method to the performance of the static method.

Fig. 34
Experiment 1, odd Frames 1–19, comparison between reconfiguration and static cameras

Fig. 35
Experiment 2, odd Frames 1–19, comparison between reconfiguration and static cameras

Fig. 36
Experiment 3, odd Frames 1–19, comparison between reconfiguration and static cameras

Fig. 37
Experiment 4, odd Frames 1–19, comparison between reconfiguration and static cameras

Fig. 38
Experiment 5, odd Frames 1–19, comparison between reconfiguration and static cameras

Table 7 lists the global x and y positions of all six cameras in the fifth experiment, expressed relative to the world reference frame on which the target object was centered.

Table 7 Camera X,Y positions for Reconfiguration Experiment 5


Cite this article

Nuger, E., Benhabib, B. Multi-Camera Active-Vision for Markerless Shape Recovery of Unknown Deforming Objects. J Intell Robot Syst 92, 223–264 (2018). https://doi.org/10.1007/s10846-018-0773-0
