A real-time efficient object segmentation system based on U-Net using aerial drone images

  • Special Issue Paper
Journal of Real-Time Image Processing

Abstract

Real-time object detection and segmentation is considered one of the fundamental but challenging problems in remote sensing and surveillance applications (including satellite and aerial imagery). It plays a crucial role in various management and monitoring applications and has received notable attention in recent years. This paper presents a real-time, efficient system in which a deep learning-based model, U-Net, is explored for multiple object segmentation in aerial drone images. We perform data augmentation and apply transfer learning to enhance the model’s efficiency. We experiment with the U-Net segmentation model using different base architectures, including VGG-16, ResNet-50, and MobileNet, compare their performance, and conclude that U-Net (MobileNet) achieves the highest accuracy of the three. The experimental results demonstrate that data augmentation improves the model’s performance, achieving a segmentation accuracy of 92%, 93%, and 95% with the base architectures VGG-16, ResNet-50, and MobileNet, respectively.
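
The abstract mentions data augmentation and a segmentation accuracy figure without giving details in this preview. As a minimal sketch only, the snippet below shows one common way to augment a segmentation pair (applying the same flip or rotation to the image and its label mask) and one common accuracy definition (the fraction of correctly labelled pixels). The function names `augment_pair` and `pixel_accuracy`, the chosen transforms, and the accuracy definition are assumptions for illustration, not the authors' stated method.

```python
import numpy as np


def augment_pair(image, mask, rng):
    """Apply the same random flip/rotation to an image (H, W, C) and its mask (H, W).

    Hypothetical helper: the transforms here are assumed, not taken from the paper.
    """
    if rng.random() < 0.5:  # horizontal flip of both arrays
        image, mask = image[:, ::-1], mask[:, ::-1]
    if rng.random() < 0.5:  # vertical flip of both arrays
        image, mask = image[::-1, :], mask[::-1, :]
    k = int(rng.integers(0, 4))  # rotate both by k * 90 degrees
    image = np.rot90(image, k, axes=(0, 1))
    mask = np.rot90(mask, k, axes=(0, 1))
    return np.ascontiguousarray(image), np.ascontiguousarray(mask)


def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted class equals the ground-truth class."""
    pred = np.asarray(pred)
    target = np.asarray(target)
    if pred.shape != target.shape:
        raise ValueError("prediction and target must have the same shape")
    return float((pred == target).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mask = rng.integers(0, 3, size=(4, 6))           # toy 3-class label mask
    image = np.stack([mask, mask, mask], axis=-1)    # toy image aligned with the mask
    aug_image, aug_mask = augment_pair(image, mask, rng)
    print(aug_image.shape, aug_mask.shape)
    print(pixel_accuracy([[0, 1], [1, 1]], [[0, 1], [0, 1]]))
```

Transforming the image and mask together keeps every pixel aligned with its label, which is the property a segmentation model depends on during training.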

Notes

  1. https://www.tugraz.at/index.php?id=22387.

Acknowledgement

This work was supported by Incheon National University Research Concentration Professors Grant in 2020.

Author information

Corresponding author

Correspondence to Gwanggil Jeon.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cite this article

Ahmed, I., Ahmad, M. & Jeon, G. A real-time efficient object segmentation system based on U-Net using aerial drone images. J Real-Time Image Proc 18, 1745–1758 (2021). https://doi.org/10.1007/s11554-021-01166-z

  • DOI: https://doi.org/10.1007/s11554-021-01166-z
