Towards an efficient 3D model estimation methodology for aerial and ground images


In this paper we propose an efficient approach for the automatic generation of 3D models from images, based on structure from motion (SfM) and multi-view stereo reconstruction techniques. Current imaging devices are capable of producing high-definition images and are a ubiquitous payload of unmanned aerial vehicles. However, the time required to obtain models quickly becomes prohibitive as the number of images increases. Our approach is image-based only: it combines metadata such as GPS, keypoint filtering, and multiple local bundle adjustment refinements, used instead of global optimization, in a novel scheme that speeds up the incremental SfM process. Results on real data show that our approach is faster than current strategies while maintaining the quality of the resulting model. Experiments with an unorganized set of images were also conducted, and the results show that our method efficiently estimates 3D models from image collections with reduced re-projection error.
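To illustrate how GPS metadata can reduce the cost of incremental SfM, the following minimal sketch restricts feature matching to pairs of images taken close to each other. It is not the scheme of the paper, whose details are not given in this abstract: the function names `haversine_m` and `candidate_pairs`, the neighbor count `k`, and the coordinates are all hypothetical and chosen only for illustration.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, an approximation


def haversine_m(a, b):
    """Great-circle distance in meters between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def candidate_pairs(gps, k=2):
    """Return image index pairs (i, j), i < j, where j is one of the
    k nearest images to i (or i one of the k nearest to j) by GPS position."""
    pairs = set()
    for i, p in enumerate(gps):
        # Sort all other images by distance to image i.
        others = sorted((haversine_m(p, q), j)
                        for j, q in enumerate(gps) if j != i)
        for _, j in others[:k]:
            pairs.add((min(i, j), max(i, j)))
    return pairs


# Hypothetical flight line: five images spaced about 11 m apart in latitude.
gps = [(-19.8690 + 0.0001 * i, -43.9660) for i in range(5)]
pairs = candidate_pairs(gps, k=2)
exhaustive = len(gps) * (len(gps) - 1) // 2
print(f"{len(pairs)} candidate pairs instead of {exhaustive}")
```

Exhaustive matching grows quadratically with the number of images, whereas a fixed neighbor count keeps the number of candidate pairs roughly linear. The sorting loop above is itself quadratic and would be replaced by a spatial index for large collections.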



  1. Commercial off-the-shelf.

  2. This value was defined empirically, but it is not critical if it remains within the limits between \(\le 0.10\) or \(\ge 0.90\) according to our experiments.

  3. Points that have a triangulation angle higher than 2.0\(^{\circ}\).
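The triangulation angle mentioned in footnote 3 is commonly defined as the angle at the 3D point between the viewing rays of two cameras. The sketch below computes it under that assumption. The function name `triangulation_angle_deg` and the example coordinates are hypothetical, and the paper may compute the angle differently.

```python
import math


def triangulation_angle_deg(c1, c2, x):
    """Angle in degrees at 3D point x between the rays to camera centers c1 and c2."""
    r1 = [ci - xi for ci, xi in zip(c1, x)]
    r2 = [ci - xi for ci, xi in zip(c2, x)]
    dot = sum(a * b for a, b in zip(r1, r2))
    n1 = math.sqrt(sum(a * a for a in r1))
    n2 = math.sqrt(sum(b * b for b in r2))
    # Clamp to [-1, 1] to guard against rounding before acos.
    cos_angle = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.degrees(math.acos(cos_angle))


# Hypothetical setup: two cameras one unit apart, points at two depths.
c1, c2 = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
near = triangulation_angle_deg(c1, c2, (0.5, 0.0, 10.0))
far = triangulation_angle_deg(c1, c2, (0.5, 0.0, 100.0))
print(near > 2.0, far > 2.0)
```

A small angle means the two rays are nearly parallel, so the depth of the point is poorly constrained. A threshold such as the 2.0 degrees in the footnote therefore selects the better-conditioned points.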




We thank the anonymous reviewers for their comments and insightful observations. This work is supported by CAPES, CNPq, FAPEMIG, and Vale Institute of Technology (ITV).

Author information



Corresponding author

Correspondence to Guilherme Potje.


Cite this article

Potje, G., Resende, G., Campos, M. et al. Towards an efficient 3D model estimation methodology for aerial and ground images. Machine Vision and Applications 28, 937–952 (2017).



  • Digital elevation model
  • Multi-view stereo
  • Large-scale 3D reconstruction
  • Structure-from-motion