Analysis of Encoder-Decoder Based Deep Learning Architectures for Semantic Segmentation in Remote Sensing Images
Semantic segmentation of remote sensing images is a challenging task. Every pixel in a remote sensing image carries a semantic meaning, and because of the high spatial resolution of these images, automatic annotation of each pixel remains an open challenge for the research community. To address this issue, deep learning based encoder-decoder architectures such as SegNet and ResNet, which are widely used on computer vision datasets, are adopted for remote sensing images, and their performance is analyzed in terms of pixel-wise classification accuracy. The experiments indicate that SegNet suffers from the degradation problem as the depth of the network is increased, reaching an overall accuracy of about 86.086%, whereas the residual network manages to overcome the degradation effect, with an overall accuracy of about 87.747%.
Keywords: Semantic segmentation, Deep learning, Encoder-Decoder architectures, Remote sensing images
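The abstract compares the networks by overall pixel-wise classification accuracy. As a minimal sketch, assuming overall accuracy means the fraction of pixels whose predicted class matches the reference label (this section does not spell out the exact evaluation protocol), the metric could be computed as follows. The function name `overall_accuracy` and the small label maps are hypothetical and serve only as an illustration; they are not data from the experiments.

```python
def overall_accuracy(predicted, reference):
    """Return the fraction of pixels whose predicted label equals the reference label.

    Both arguments are 2-D label maps given as lists of rows of integer class ids.
    Assumption: every pixel counts equally and no pixels are excluded.
    """
    if len(predicted) != len(reference):
        raise ValueError("label maps must have the same number of rows")
    correct = 0
    total = 0
    for pred_row, ref_row in zip(predicted, reference):
        if len(pred_row) != len(ref_row):
            raise ValueError("label maps must have the same number of columns")
        for pred_label, ref_label in zip(pred_row, ref_row):
            correct += int(pred_label == ref_label)
            total += 1
    if total == 0:
        raise ValueError("label maps are empty")
    return correct / total


# Hypothetical 3x4 label maps with four classes (0..3), for illustration only.
reference = [
    [0, 0, 1, 1],
    [0, 2, 2, 1],
    [3, 3, 2, 1],
]
predicted = [
    [0, 0, 1, 1],
    [0, 2, 1, 1],  # one pixel of class 2 predicted as class 1
    [3, 0, 2, 1],  # one pixel of class 3 predicted as class 0
]

accuracy = overall_accuracy(predicted, reference)
print(f"Overall accuracy: {100 * accuracy:.3f}%")  # prints "Overall accuracy: 83.333%"
```

Here 10 of the 12 pixels agree, giving 83.333%. Figures such as 86.086% and 87.747% would come from the same kind of ratio taken over all evaluated pixels of the test images.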
The authors would like to thank DRDO-ERIPR for funding under research grant no. ERIP/ER/1203080/M/01/1569. The first author would like to thank CSIR for funding under grant no. 09/1095(0033)18-EMR-I. The Vaihingen dataset was obtained from the German Society for Photogrammetry, Remote Sensing and Geoinformation (DGPF) (Cramer 2010): http://www.ifp.uni-stuttgart.de/dgpf/DKEP-Allg.html. The authors thank ISPRS for making the dataset openly available.
- Audebert, N., Le Saux, B., Lefèvre, S.: How useful is region-based classification of remote sensing images in a deep learning framework? In: 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), pp. 5091–5094. IEEE (2016a)
- Sun, W., Wang, R.: Fully convolutional networks for semantic segmentation of very high resolution remotely sensed images combined with DSM. IEEE Geosci. Remote Sens. Lett. 15(3), 474–478 (2018)
- Sun, Y., Zhang, X., Xin, Q., Huang, J.: Developing a multi-filter convolutional neural network for semantic segmentation using high-resolution aerial imagery and LiDAR data. ISPRS J. Photogramm. Remote Sens. 143, 3–14 (2018)
- Kemker, R., Salvaggio, C., Kanan, C.: Algorithms for semantic segmentation of multispectral remote sensing imagery using deep learning. ISPRS J. Photogramm. Remote Sens. 145, 60–77 (2018)
- He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
- Badrinarayanan, V., Kendall, A., Cipolla, R.: Segnet: a deep convolutional encoder-decoder architecture for image segmentation. arXiv preprint arXiv:1511.00561 (2015)
- Audebert, N., Le Saux, B., Lefèvre, S.: Fusion of heterogeneous data in convolutional networks for urban semantic labeling. In: Urban Remote Sensing Event (JURSE), 2017 Joint, pp. 1–4. IEEE (2017)
- Audebert, N., Le Saux, B., Lefèvre, S.: Semantic segmentation of earth observation data using multimodal and multi-scale deep networks. In: Asian Conference on Computer Vision, pp. 180–196. Springer, Cham (2016b)
- Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440 (2015)
- He, K., Zhang, X., Ren, S., Sun, J.: Delving deep into rectifiers: surpassing human-level performance on imagenet classification. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1026–1034 (2015)
- Cramer, M.: The DGPF-test on digital airborne camera evaluation–overview and test design. Photogrammetrie-Fernerkundung-Geoinformation, 2010(2), 73–82 (2010)
- MATLAB version 9.5.0. Natick, Massachusetts: The MathWorks Inc., September 2018