Abstract
This work proposes an image-to-image translation approach, based on the deep learning model known as cycle-consistent adversarial networks (CycleGAN), for reconstructing a digital surface model (DSM) from monocular aerial imagery. The proposed architecture consists of generators built as encoder–decoder networks with skip connections and two discriminators that penalize structure at the scale of image patches. The CycleGAN objective function was extended with an L1 loss so that the model can be trained on paired samples. A conditional GAN is used as the baseline model in this study. The proposed approach showed higher reconstruction capability for generating a surface model from aerial imagery than previous studies that used conditional GANs. The proposed architecture exhibited strong potential for reconstructing a surface model from a single aerial image, with the capacity to generalize across multiple cities and built-up environments. The results can be useful in urban studies and in the visualization of urban data for better governance.
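As a rough illustration of the objective described above, the following minimal sketch combines an adversarial term, a cycle-consistency term, and a paired L1 term. It is not the authors' implementation. The least-squares form of the adversarial term, the weights `lambda_cyc` and `lambda_l1`, the choice to apply the paired L1 term only in the image-to-DSM direction, and all function names are assumptions made for illustration; images are represented as flat lists of floats to keep the example self-contained.

```python
from typing import Callable, Sequence

Image = Sequence[float]  # flattened image or height map (illustrative only)


def l1(a: Image, b: Image) -> float:
    """Mean absolute error between two equally sized flattened arrays."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def patch_adv_loss(patch_scores: Sequence[float]) -> float:
    """Least-squares adversarial loss for the generator (an assumption).

    The discriminator returns one score per image patch; the generator
    is rewarded when every patch score is close to 1 ("real").
    """
    return sum((s - 1.0) ** 2 for s in patch_scores) / len(patch_scores)


def generator_objective(
    image: Image,
    dsm: Image,
    G: Callable[[Image], Image],        # hypothetical generator: image -> DSM
    F: Callable[[Image], Image],        # hypothetical generator: DSM -> image
    D_dsm: Callable[[Image], Sequence[float]],  # patch discriminator on DSMs
    D_img: Callable[[Image], Sequence[float]],  # patch discriminator on images
    lambda_cyc: float = 10.0,           # assumed weight, not from the paper
    lambda_l1: float = 10.0,            # assumed weight, not from the paper
) -> float:
    fake_dsm = G(image)
    fake_img = F(dsm)

    # Adversarial term: both generators try to fool their discriminator.
    adv = patch_adv_loss(D_dsm(fake_dsm)) + patch_adv_loss(D_img(fake_img))

    # Cycle-consistency term: translating there and back should
    # reproduce the input in both directions.
    cyc = l1(F(fake_dsm), image) + l1(G(fake_img), dsm)

    # Paired L1 term: with a paired sample, the predicted DSM can be
    # compared directly with the reference DSM.
    paired = l1(fake_dsm, dsm)

    return adv + lambda_cyc * cyc + lambda_l1 * paired
```

In this sketch the paired L1 term is what distinguishes the objective from a plain cycle-consistent setup: it ties the generated surface model to the reference height values directly, while the cycle term only requires that the round trip reproduces the input.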
Acknowledgements
We are grateful to (i) NRDMS, Department of Science and Technology, GOI; (ii) Sponsored research in Consultancy cell, Indian Institute of Technology Kharagpur; and (iii) the West Bengal Department of Higher Education for financial and infrastructure support.
Cite this article
Aithal, B.H., Das, S.K. & Subrahmanya, P.P. Urban 3D Structure Reconstruction Through a Generative Adversarial Network Model. Arab J Sci Eng 45, 10731–10741 (2020). https://doi.org/10.1007/s13369-020-04850-7
DOI: https://doi.org/10.1007/s13369-020-04850-7