Challenges in the deep learning-based semantic segmentation of benthic communities from Ortho-images


Since the early days of low-cost camera development, collecting visual data has been common practice in underwater monitoring. Nevertheless, video and image sequences are a trustworthy source of knowledge that remains partially untapped. Human-based image analysis is a time-consuming task that creates a bottleneck between data collection and extrapolation. Today, the annotation of biologically meaningful information from imagery can be efficiently automated or accelerated by convolutional neural networks (CNNs). Drawing on our case studies, we give an overview of the potential and the difficulties of accurate automatic recognition and segmentation of benthic species. This paper focuses on the application of deep learning techniques to the by-products of multi-view stereo reconstruction (registered images, point clouds, ortho-projections), given how widely these techniques have spread through the marine science community. Of particular importance is the need to semantically segment imagery in order to generate the demographic data that are vital for understanding and exploring the changes happening within marine communities.






The authors would like to thank the Sandin Lab (Scripps Institution of Oceanography, UCSD) for their collaboration and for kindly providing all the annotated orthos presented in this study. We also thank Marco Callieri for his useful suggestions on how to improve the manuscript.

Author information



Corresponding author

Correspondence to G. Pavoni.


Cite this article

Pavoni, G., Corsini, M., Pedersen, N. et al. Challenges in the deep learning-based semantic segmentation of benthic communities from Ortho-images. Appl Geomat 13, 131–146 (2021).



  • Underwater monitoring
  • Coral reef surveys
  • Semantic segmentation
  • Automatic classification
  • Deep Learning