Abstract
We present a novel interactive framework for improving 3D reconstruction, starting from incomplete or noisy results obtained through image-based reconstruction algorithms. The core idea is to let the user provide localized hints on the curvature of the surface, which are turned into constraints during an energy-minimization reconstruction. To make this task simple, we propose two algorithms. The first is a multi-view segmentation algorithm that propagates the foreground selection made on one or more images both to all the images of the input set and to the 3D points, so that the user can accurately select the part of the scene to be reconstructed. The second is a fast GPU-based algorithm for reconstructing smooth surfaces from multiple views, which incorporates the hints provided by the user. We show that our framework can turn a poor-quality reconstruction produced with state-of-the-art image-based reconstruction methods into a high-quality one.
References
Kowdle, A., Sinha, S.N., Szeliski, R.: Multiple view object cosegmentation using appearance and stereo cues. In: European Conference on Computer Vision (ECCV 2012) (2012)
Alexe, B., Deselaers, T., Ferrari, V.: ClassCut for unsupervised class segmentation. In: ECCV 2010, pp. 8–10 (2010). http://www.springerlink.com/index/D62206186631X328.pdf
Bao, S.Y., Chandraker, M., Lin, Y., Savarese, S.: Dense object reconstruction with semantic priors. In: CVPR, pp. 1264–1271 (2013). doi:10.1109/CVPR.2013.167. http://www.cv-foundation.org/openaccess/CVPR2013.py
Barzilai, J., Borwein, J.M.: Two-point step size gradient methods. IMA J. Numer. Anal. 8(1), 141–148 (1988). doi:10.1093/imanum/8.1.141. http://imajna.oxfordjournals.org/content/8/1/141.abstract
Bleyer, M., Rother, C., Kohli, P., Scharstein, D., Sinha, S.: Object stereo–joint stereo matching and object segmentation. In: Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition, CVPR ’11, pp. 3081–3088. IEEE Computer Society, Washington (2011). doi:10.1109/CVPR.2011.5995581
Boykov, Y., Jolly, M.: Interactive graph cuts for optimal boundary & region segmentation of objects in ND images. In: ICCV 2001, pp. 105–112 (2001). http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=937505
Bradley, D., Boubekeur, T., Heidrich, W.: Accurate multi-view reconstruction using robust binocular stereo and surface meshing. In: CVPR. IEEE Computer Society (2008). doi:10.1109/CVPR.2008.4587792
Briggs, W.L., Henson, V.E., McCormick, S.F.: A Multigrid Tutorial, 2nd edn. Society for Industrial and Applied Mathematics, Philadelphia (2000)
Campbell, N., Vogiatzis, G., Hernandez, C., Cipolla, R.: Automatic object segmentation from calibrated images. In: Visual Media Production Conference, pp. 126–137 (2011). doi:10.1109/CVMP.2011.21
Campbell, N.D., Vogiatzis, G., Hernández, C., Cipolla, R.: Using multiple hypotheses to improve depth-maps for multi-view stereo. In: Proceedings of the 10th European Conference on Computer Vision: Part I, ECCV ’08, pp. 766–779. Springer, Berlin (2008). doi:10.1007/978-3-540-88682-2_58
Curless, B., Levoy, M.: A volumetric method for building complex models from range images. In: Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH ’96, pp. 303–312. ACM, New York (1996). doi:10.1145/237170.237269
Djelouah, A., Franco, J.S., Boyer, E., Le Clerc, F., Pérez, P.: N-tuple color segmentation for multi-view silhouette extraction. In: ECCV (5) ’12, pp. 818–831 (2012)
Djelouah, A., Franco, J.S., Boyer, E., Le Clerc, F., Pérez, P.: Multi-view object segmentation in space and time. In: 2013 IEEE International Conference on Computer Vision (ICCV), pp. 2640–2647 (2013). doi:10.1109/ICCV.2013.328
Freedman, D.: Interactive graph cut based segmentation with shape priors. In: CVPR’05, vol. 1, pp. 755–762 (2005). doi:10.1109/CVPR.2005.191
Furukawa, Y., Curless, B., Seitz, S., Szeliski, R.: Manhattan-world stereo. In: CVPR 2009, pp. 1422–1429 (2009). doi:10.1109/CVPR.2009.5206867. http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5206867
Furukawa, Y., Curless, B., Seitz, S.M., Szeliski, R.: Towards internet-scale multi-view stereo. In: 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2010)
Furukawa, Y., Ponce, J.: Accurate, dense, and robust multiview stereopsis. IEEE Trans. Pattern Anal. Mach. Intell. 32(8), 1362–1376 (2010). doi:10.1109/TPAMI.2009.161
Gallup, D., Frahm, J., Pollefeys, M.: Piecewise planar and non-planar stereo for urban scene reconstruction. In: CVPR’10, pp. 1418–1425 (2010). http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5539804
Goesele, M., Curless, B., Seitz, S.M.: Multi-view stereo revisited. In: Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, CVPR ’06, pp. 2402–2409. IEEE Computer Society, Washington (2006). doi:10.1109/CVPR.2006.199
Gopi, M., Krishnan, S., Silva, C.: Surface reconstruction based on lower dimensional localized delaunay triangulation. Comput. Graph. Forum 19(3), 467–478 (2000). doi:10.1111/1467-8659.00439
Hoff III, K.E., Keyser, J., Lin, M., Manocha, D., Culver, T.: Fast computation of generalized voronoi diagrams using graphics hardware. In: Proceedings of the 26th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH ’99, pp. 277–286. ACM Press/Addison-Wesley, New York (1999). doi:10.1145/311535.311567
Jancosek, M., Pajdla, T.: Segmentation based multi-view stereo. In: Computer Vision Winter Workshop (2009)
Kass, M., Witkin, A., Terzopoulos, D.: Snakes: active contour models. IJCV 1(4), 321–331 (1988)
Kazhdan, M.M., Bolitho, M., Hoppe, H.: Poisson surface reconstruction. In: Symposium on Geometry Processing, pp. 61–70 (2006)
Kolev, K., Brox, T., Cremers, D.: Robust variational segmentation of 3D objects from multiple views. In: Pattern Recognition (Proc. DAGM 2006), pp. 688–697 (2006). http://www.springerlink.com/index/m68268261t8h0641.pdf
Kolev, K., Klodt, M., Brox, T., Cremers, D.: Propagated photoconsistency and convexity in variational multiview 3D reconstruction. In: Workshop on Photometric Analysis for Computer Vision, Rio de Janeiro (2007)
Kolev, K., Pock, T., Cremers, D.: Anisotropic minimal surfaces integrating photoconsistency and normal information for multiview stereo. In: ECCV’10, Heraklion (2010)
Kolmogorov, V., Zabih, R.: What energy functions can be minimized via graph cuts? IEEE Trans. Pattern Anal. Mach. Intell. 26(2), 147–159 (2004). doi:10.1109/TPAMI.2004.1262177. http://www.ncbi.nlm.nih.gov/pubmed/15376891
Mortensen, E., Barrett, W.: Intelligent scissors for image composition. In: Proceedings of the 22nd Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH ’95, pp. 191–198 (1995). http://dl.acm.org/citation.cfm?id=218442
Nan, L., Sharf, A., Chen, B.: 2D–3D lifting for shape reconstruction. Comput. Graph. Forum 33(7), 249–258 (2014). doi:10.1111/cgf.12493
Nguyen, H., Wünsche, B., Delmas, P., Lutteroth, C., Zhang, E.: A robust hybrid image-based modeling system. Vis. Comput. 1–16 (2015). doi:10.1007/s00371-015-1078-y
Öztireli, A.C., Guennebaud, G., Gross, M.: Feature preserving point set surfaces based on non-linear kernel regression. Comput. Graph. Forum 28(2), 493–501 (2009)
Rother, C., Kolmogorov, V., Blake, A.: GrabCut: interactive foreground extraction using iterated graph cuts. ACM Trans. Graph. 23(3), 309–314 (2004). http://dl.acm.org/citation.cfm?id=1015720
Seitz, S., Curless, B., Diebel, J., Scharstein, D., Szeliski, R.: A comparison and evaluation of multi-view stereo reconstruction algorithms. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2006), vol. 1, pp. 519–528 (2006). doi:10.1109/CVPR.2006.19
Seitz, S.M., Dyer, C.R.: Photorealistic scene reconstruction by voxel coloring. In: Proceedings of the Computer Vision and Pattern Recognition Conference, pp. 1067–1073 (1997)
Sinha, S., Steedly, D., Szeliski, R.: Piecewise planar stereo for image-based rendering. In: ICCV, pp. 1881–1888 (2009). doi:10.1109/ICCV.2009.5459417
Snavely, N., Seitz, S.M., Szeliski, R.: Photo tourism: exploring photo collections in 3D. ACM Trans. Graph. 25(3), 835–846 (2006). doi:10.1145/1141911.1141964
Sormann, M., Zach, C., Karner, K.: Graph cut based multiple view segmentation for 3D reconstruction. In: Third International Symposium on 3D Data Processing, Visualization, and Transmission, pp. 1085–1092 (2006). doi:10.1109/3DPVT.2006.70
Vicente, S.: Graph cut based image segmentation with connectivity priors. In: CVPR ’08, pp. 1–8 (2008). doi:10.1109/CVPR.2008.4587440
Wahba, G.: Spline Models for Observational Data. SIAM, Philadelphia (1990)
Wu, C., Agarwal, S., Curless, B., Seitz, S.M.: Schematic surface reconstruction. In: CVPR 2012, pp. 1498–1505 (2012). doi:10.1109/CVPR.2012.6247839
Yezzi, A., Soatto, S.: Stereoscopic segmentation. IJCV 53(1), 31–43 (2003). http://www.springerlink.com/index/V812463066072825.pdf
Zhu, C., Leow, W.: Textured mesh surface reconstruction of large buildings with multi-view stereo. Vis. Comput. 29(6–8), 609–615 (2013). doi:10.1007/s00371-013-0827-z
Chaurasia, G., Duchêne, S., Sorkine-Hornung, O., Drettakis, G.: Depth synthesis and local warps for plausible image-based navigation. ACM Trans. Graph. 32(3) (2013). http://www-sop.inria.fr/reves/Basilic/2013/CDSD13
Acknowledgments
The research leading to these results was funded by EU FP7 project ICT FET Harvest4D (http://www.harvest4d.org/, G.A. no. 323567). The Museum Dataset is courtesy of Chaurasia et al. [45].
Electronic supplementary material
Supplementary material 1 (mp4 28128 KB)
Appendices
Appendix A: Algebraic derivations of the gradient of the energy terms
1.1 Smoothness term
The derivative of the smoothness term with respect to \(z_{m,n}\) is simply:
The unrolled formula, using central finite differences, is:
which finally gives:
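The finite-difference scheme behind this derivation can be illustrated numerically. The sketch below is an assumption for illustration only: it uses a simple thin-plate-style energy of squared second differences (not necessarily the paper's exact weighting of Eq. (14)) and checks the analytic gradient, built by reapplying the self-adjoint central stencil, against numerical differentiation at an interior pixel:

```python
import numpy as np

# Hypothetical smoothness energy E(z) = sum(z_xx^2 + z_yy^2) over a depth
# map z, using central second differences; the paper's actual term may
# weight the derivatives differently.

def zxx(z):
    d = np.zeros_like(z)
    d[:, 1:-1] = z[:, 2:] - 2 * z[:, 1:-1] + z[:, :-2]  # central 2nd diff in x
    return d

def zyy(z):
    d = np.zeros_like(z)
    d[1:-1, :] = z[2:, :] - 2 * z[1:-1, :] + z[:-2, :]  # central 2nd diff in y
    return d

def energy(z):
    return np.sum(zxx(z) ** 2 + zyy(z) ** 2)

def grad(z):
    # The interior second-difference stencil is self-adjoint, so the
    # gradient at interior pixels is obtained by applying it again.
    return 2 * (zxx(zxx(z)) + zyy(zyy(z)))

rng = np.random.default_rng(0)
z = rng.standard_normal((8, 8))
g = grad(z)

# Check the analytic gradient against a central numerical derivative
# at one interior pixel (the energy is quadratic, so this is sharp).
m, n, eps = 4, 4, 1e-5
zp = z.copy(); zp[m, n] += eps
zm = z.copy(); zm[m, n] -= eps
num = (energy(zp) - energy(zm)) / (2 * eps)
assert abs(num - g[m, n]) < 1e-4
```

The same check, repeated at every interior pixel, is a convenient way to validate hand-derived gradients like the ones unrolled above.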
1.2 Coherence term
We therefore need the derivatives of \(g_k(i,j,z)\) and \(f_k(i,j,z)\).
Since the cameras are calibrated, we know the matrix \(\mathbf {R}_{h,k}\) that transforms the depth values from camera k to camera h:
where \(\mathbf {I}\) and \(\mathbf {E}\) denote the intrinsic and extrinsic matrices of cameras h and k.
\(g_k(i,j,z)\) is defined as:
where \(\mathbf {s}_{w}\) is the row vector that selects component w (i.e., \(\mathbf {s}_{w}=[0\;\,0\;\,0\;\,1]\)), and the \(\mathbf {r}_{ij}\) are the entries of the matrix \(\mathbf {R}\). The derivative of \(g_k\) is then:
This is no surprise: \(g_k(i,j,z)\) simply returns the distance between a point moving along a line and a plane, and this distance varies linearly with z.
For function \(f_k(i,j,z)\), things are a little harder, because it describes the depth map \(z_k\) along the projection of the line on the image plane of camera k. Let us define the parametric function describing such a projection:
Function \(f_k\) is then the composition of \(z_k(x,y):{\mathbb {{R}}}^{2}\rightarrow {\mathbb {{R}}}\) with \(\mathbf {u}\), i.e., \((z_k\circ \mathbf {u}):{\mathbb {{R}}}\rightarrow {\mathbb {{R}}}\). The derivative of the composition is therefore
We still need \(\frac{\mathrm{d}}{\mathrm{d}z}u_{x}(z)\) (by symmetry, the same derivation yields its y-axis counterpart). This is the derivative of a function of the form \(\frac{\alpha z+\beta }{\gamma z+\delta }\), whose derivative is \(\frac{\alpha \delta -\beta \gamma }{(\gamma z+\delta )^{2}}\). Therefore,
where \(\alpha _{x} = (\mathbf {r}_{00}i+\mathbf {r}_{01}j+\mathbf {r}_{03}) = \mathrm{d}\mathbf {v}_{x}\), \(\alpha _{y} = (\mathbf {r}_{10}i+\mathbf {r}_{11}j+\mathbf {r}_{13}) = \mathrm{d}\mathbf {v}_{y}\), and \(\gamma =(\mathbf {r}_{30}i+\mathbf {r}_{31}j+\mathbf {r}_{33})=\mathrm{d}\mathbf {v}_{w}\). In conclusion, the gradient is:
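The quotient-rule formula used above is easy to sanity-check numerically. The sketch below uses arbitrary coefficients standing in for \(\alpha\), \(\beta\), \(\gamma\), \(\delta\) and compares the closed form \(\frac{\alpha \delta -\beta \gamma }{(\gamma z+\delta )^{2}}\) against a central finite difference:

```python
# Numerical check of d/dz [(a*z + b) / (c*z + d)] = (a*d - b*c) / (c*z + d)**2.
# The coefficients are arbitrary placeholders, not values from the paper.

def u(z, a, b, c, d):
    return (a * z + b) / (c * z + d)

def du_analytic(z, a, b, c, d):
    return (a * d - b * c) / (c * z + d) ** 2

a, b, c, d = 1.3, -0.7, 0.4, 2.1
z0, eps = 5.0, 1e-6

# Central finite difference of u at z0
num = (u(z0 + eps, a, b, c, d) - u(z0 - eps, a, b, c, d)) / (2 * eps)
assert abs(num - du_analytic(z0, a, b, c, d)) < 1e-8
```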
1.3 Curvature term
Proceeding as for the smoothness term:
which, after a straightforward but tedious derivation, gives:
Appendix B: An example of handling discontinuities with the lookup table (LUT)
In this section, we show how the coefficients of the LUT are derived in a specific case. Consider Eq. (14) for the gradient of the smoothness term, which is a weighted sum of second derivatives, and focus on one term of the sum:
If \(\mathbf {z}_{xx}(n-2,m)\) is computed by central finite differences, we have:
In other words, since \(z_{m,n}\) does not appear in the computation of \(\mathbf {z}_{xx}(n-2,m)\), the derivative with respect to \(z_{m,n}\), and thus A, is zero. Referring to Fig. 7, the entry for this configuration (first row) is therefore null.
On the other hand, if \(\mathbf {z}_{xx}(n-2,m)\) is computed by forward finite differences, we have:
and thus:
This gives the coefficients to apply, as shown in Fig. 7 (second row).
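The dependence structure discussed in this appendix can be verified directly. The sketch below (a minimal illustration, with arbitrary depth values) shows that the central stencil for \(\mathbf{z}_{xx}\) at column \(n-2\) does not involve \(z_{m,n}\), while the forward stencil does, which is exactly why the corresponding LUT coefficients are zero and nonzero, respectively:

```python
import numpy as np

# z_xx at column c via the two stencils discussed above:
#   central: z[c+1] - 2*z[c] + z[c-1]   (uses columns c-1, c, c+1)
#   forward: z[c+2] - 2*z[c+1] + z[c]   (uses columns c, c+1, c+2)

def zxx_central(z, m, c):
    return z[m, c + 1] - 2 * z[m, c] + z[m, c - 1]

def zxx_forward(z, m, c):
    return z[m, c + 2] - 2 * z[m, c + 1] + z[m, c]

rng = np.random.default_rng(1)
z = rng.standard_normal((5, 8))
m, n = 2, 4

zp = z.copy()
zp[m, n] += 1.0  # perturb z[m, n]

# Central stencil at n-2 uses columns n-3..n-1: independent of z[m, n].
assert zxx_central(zp, m, n - 2) == zxx_central(z, m, n - 2)

# Forward stencil at n-2 uses columns n-2..n: changes by +1, i.e., the
# coefficient of z[m, n] in this stencil is 1.
assert np.isclose(zxx_forward(zp, m, n - 2) - zxx_forward(z, m, n - 2), 1.0)
```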
Cite this article
Baldacci, A., Bernabei, D., Corsini, M. et al. 3D reconstruction for featureless scenes with curvature hints. Vis Comput 32, 1605–1620 (2016). https://doi.org/10.1007/s00371-015-1144-5