Vectorizing World Buildings: Planar Graph Reconstruction by Primitive Detection and Relationship Inference

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12353)


This paper tackles a 2D architecture vectorization problem, whose task is to infer an outdoor building architecture as a 2D planar graph from a single RGB image. We provide a new benchmark with ground-truth annotations for 2,001 complex buildings across the cities of Atlanta, Paris, and Las Vegas. We also propose a novel algorithm utilizing 1) convolutional neural networks (CNNs) that detect geometric primitives and infer their relationships and 2) an integer program (IP) that assembles the information into a 2D planar graph. While trivial for human vision, the inference of a graph structure with an arbitrary topology is still an open problem for computer vision. Qualitative and quantitative evaluations demonstrate that our algorithm makes significant improvements over the current state of the art, towards an intelligent system at the level of human perception. We will share code and data.
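As a rough illustration of the second stage described in the abstract, the sketch below assembles a planar graph from primitive detections by solving a tiny 0/1 program with exhaustive search. It is not the paper's formulation: the objective (confidence minus 0.5), the three constraints, and the names `assemble_graph`, `segments_cross`, `corner_conf`, and `edge_conf` are all assumptions made for this example, and the confidence values stand in for CNN outputs.

```python
from itertools import product


def segments_cross(p, q, r, s):
    """True if segments pq and rs properly intersect (touching at endpoints does not count)."""
    def orient(a, b, c):
        # Sign of the cross product (b - a) x (c - a).
        return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

    d1, d2 = orient(p, q, r), orient(p, q, s)
    d3, d4 = orient(r, s, p), orient(r, s, q)
    return d1 * d2 < 0 and d3 * d4 < 0


def assemble_graph(corners, corner_conf, edge_conf):
    """Pick corners and edges maximizing sum(conf - 0.5) under toy constraints.

    Hypothetical constraints (assumptions, not taken from the paper):
      1. an edge may be kept only if both of its end corners are kept;
      2. every kept corner has at least two kept edges;
      3. no two kept edges cross, so the result is a planar graph.
    Exhaustive search is used instead of an IP solver; it only scales to toy inputs.
    """
    edges = sorted(edge_conf)
    n = len(corners)
    best_score, best = float("-inf"), ([], [])
    for bits in product((0, 1), repeat=n + len(edges)):
        use_corner = bits[:n]
        chosen = [e for e, b in zip(edges, bits[n:]) if b]
        # Constraint 1: edges need both endpoints.
        if any(not (use_corner[i] and use_corner[j]) for i, j in chosen):
            continue
        # Constraint 2: kept corners need degree >= 2.
        degree = [0] * n
        for i, j in chosen:
            degree[i] += 1
            degree[j] += 1
        if any(u and d < 2 for u, d in zip(use_corner, degree)):
            continue
        # Constraint 3: planarity.
        if any(
            segments_cross(corners[a], corners[b], corners[c], corners[d])
            for k, (a, b) in enumerate(chosen)
            for (c, d) in chosen[k + 1:]
        ):
            continue
        score = sum(corner_conf[i] - 0.5 for i, u in enumerate(use_corner) if u)
        score += sum(edge_conf[e] - 0.5 for e in chosen)
        if score > best_score:
            best_score = score
            best = ([i for i, u in enumerate(use_corner) if u], chosen)
    return best


# Hypothetical detections: a unit square plus one spurious corner.
corners = [(0, 0), (1, 0), (1, 1), (0, 1), (2, 2)]
corner_conf = [0.9, 0.9, 0.9, 0.9, 0.2]
edge_conf = {
    (0, 1): 0.9, (1, 2): 0.9, (2, 3): 0.9, (0, 3): 0.9,  # square sides
    (0, 2): 0.6, (1, 3): 0.55,                            # crossing diagonals
    (2, 4): 0.3,                                          # edge to spurious corner
}
kept_corners, kept_edges = assemble_graph(corners, corner_conf, edge_conf)
print(kept_corners, kept_edges)
```

With these made-up scores, the search keeps the four square corners and the four sides, keeps only the higher-scoring of the two crossing diagonals, and drops the spurious corner together with its edge. A practical system would hand the same variables and constraints to an IP solver rather than enumerate assignments.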


Keywords: Vectorization, Remote sensing, Deep learning, Planar graph



This research is partially supported by NSERC Discovery Grants, NSERC Discovery Grants Accelerator Supplements, and DND/NSERC Discovery Grant Supplement. This research is also supported by the Intelligence Advanced Research Projects Activity (IARPA) via Department of Interior/Interior Business Center (DOI/IBC) contract number D17PC00288. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of IARPA, DOI/IBC, or the U.S. Government.




Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Simon Fraser University, Burnaby, Canada
