
International Journal of Computer Vision, Volume 104, Issue 1, pp 94–116

3D Scene Reconstruction from Multiple Spherical Stereo Pairs

  • Hansung Kim
  • Adrian Hilton

Abstract

We propose a 3D environment modelling method using multiple pairs of high-resolution spherical images. Spherical images of a scene are captured using a rotating line-scan camera. Reconstruction is based on stereo image pairs with a vertical displacement between camera views. A 3D mesh model for each pair of spherical images is reconstructed by stereo matching. For accurate surface reconstruction, we propose a PDE-based disparity estimation method which produces continuous depth fields with sharp depth discontinuities even in occluded and highly textured regions. A full environment model is constructed by fusing the partial reconstructions from spherical stereo pairs captured at multiple widely spaced locations. To avoid camera calibration at every capture location, we compute 3D rigid transforms between capture points using feature matching and register all meshes into a unified coordinate system. Finally, a complete 3D model of the environment is generated by selecting the most reliable observations among overlapping surface measurements, considering surface visibility, orientation and distance from the camera. We analyse the characteristics and behaviour of errors for spherical stereo imaging. Performance of the proposed algorithm is evaluated against ground truth from the Middlebury stereo test bed and LIDAR scans. Results are also compared with conventional structure-from-motion algorithms. The final composite model is rendered from a wide range of viewpoints with high-quality textures.
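
To make the vertical-baseline geometry concrete, a minimal Python sketch of the underlying spherical triangulation is given below. It assumes the two camera centres lie on a common vertical axis, so epipolar curves become columns of constant azimuth in the equirectangular images and the angular disparity along the polar direction maps directly to range; the function and variable names are illustrative assumptions, not the authors' implementation.

import numpy as np

def spherical_depth(theta_top, theta_bottom, baseline):
    """Range from the bottom camera to a point seen at polar angles theta_top and
    theta_bottom (radians, measured from the upward vertical axis) in the top and
    bottom spherical images, given a vertical baseline in metres."""
    disparity = theta_top - theta_bottom      # angular disparity along the epipolar column
    disparity = np.maximum(disparity, 1e-6)   # guard against zero disparity (points at infinity)
    return baseline * np.sin(theta_top) / np.sin(disparity)

def to_cartesian(r, theta, phi):
    """Convert range r, polar angle theta and azimuth phi into x, y, z coordinates
    in the bottom camera's frame (z pointing up)."""
    return np.array([r * np.sin(theta) * np.cos(phi),
                     r * np.sin(theta) * np.sin(phi),
                     r * np.cos(theta)])

# Example: a point 30 degrees below the horizon in the bottom image that appears
# 0.5 degrees lower in the top image, with a 0.6 m vertical baseline.
r = spherical_depth(np.radians(120.5), np.radians(120.0), baseline=0.6)
print(r, to_cartesian(r, np.radians(120.0), phi=0.0))

Note that for scene points close to the vertical baseline axis (near the zenith or nadir) the angular disparity vanishes regardless of depth, an inherent limitation of vertical-baseline spherical stereo geometry.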

Keywords

3D reconstruction · Environment modelling · Disparity estimation · 3D registration and mesh integration

Notes

Acknowledgments

This research was supported by the EU FP7 project i3DPost, the UK TSB project SyMMM and the EU ICT FP7 project IMPART.

Supplementary material

Supplementary material 1: 11263_2013_616_MOESM1_ESM.mov (MOV, 28.3 MB)

Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  1. Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford, Surrey, UK
