
Underwater Object Recognition Based on Deep Encoding-Decoding Network

Journal of Ocean University of China

Abstract

Ocean underwater exploration is a branch of oceanography that investigates physical and biological conditions for scientific and commercial purposes. Video technology plays an important role in this field and is widely applied to underwater environment observation because, unlike conventional survey methods, it monitors the underwater ecosystem continuously and non-invasively. However, the scattering and attenuation of light propagating through water produce complex noise distributions and low-light conditions that challenge underwater video applications such as object detection and recognition. In this paper, we propose a new deep encoding-decoding convolutional architecture for underwater object recognition. The encoding-decoding network extracts discriminative features from noisy, low-light underwater images. For classification, we replace full connection with deconvolutional layers whose kernels match the size of the corresponding feature maps, which avoids the dimensionality explosion and low accuracy of fully connected layers. Moreover, we introduce data augmentation and transfer learning to address the shortage of training data. In our experiments, we evaluated the proposed method and state-of-the-art methods on public datasets. The results show that our approach achieves high accuracy. This work provides a new underwater technology for ocean exploration.
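
To make the classification head concrete, below is a minimal PyTorch sketch of the idea (the paper does not publish code here, so this is an illustration, not the authors' implementation): an encoder extracts features, a decoder of transposed convolutions restores spatial resolution, and a convolution whose kernel matches the final feature-map size stands in for the fully connected layer. All layer sizes, channel counts, the 32x32 input size, and the class count are assumptions; in practice the encoder could be initialized from weights pretrained on a larger dataset (transfer learning) and the training set enlarged by augmentation.

```python
# Minimal sketch (assumed PyTorch framework; not the authors' released code).
import torch
import torch.nn as nn


class EncoderDecoderClassifier(nn.Module):
    """Encoder-decoder CNN with a matched-kernel convolution as the classifier."""

    def __init__(self, num_classes: int = 23, in_channels: int = 3):
        super().__init__()
        # Encoder: strided convolutions extract features from noisy, low-light frames.
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        # Decoder: transposed convolutions (deconvolutions) recover spatial detail.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, kernel_size=3, stride=2, padding=1, output_padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 64, kernel_size=3, stride=2, padding=1, output_padding=1),
            nn.ReLU(inplace=True),
        )
        # Classification head: a convolution whose kernel size equals the decoder's
        # output feature-map size replaces a fully connected layer, so the parameter
        # count does not explode when the feature maps are large.
        self.feature_size = 16  # assumed spatial size of the decoder output for 32x32 inputs
        self.classifier = nn.Conv2d(64, num_classes, kernel_size=self.feature_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.decoder(self.encoder(x))   # (N, 64, 16, 16) for 32x32 inputs
        logits = self.classifier(feats)         # (N, num_classes, 1, 1)
        return logits.flatten(1)                # (N, num_classes)


if __name__ == "__main__":
    model = EncoderDecoderClassifier(num_classes=23)
    dummy = torch.randn(4, 3, 32, 32)           # a batch of (assumed) 32x32 underwater crops
    print(model(dummy).shape)                   # torch.Size([4, 23])
```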



Acknowledgements

This study was supported by the Jilin Science and Technology Development Plan Project (Nos. 20160209006GX, 20170309001GX, and 20180201043GX).

Author information


Corresponding author

Correspondence to Jihong Ouyang.


Cite this article

Wang, X., Ouyang, J., Li, D. et al. Underwater Object Recognition Based on Deep Encoding-Decoding Network. J. Ocean Univ. China 18, 376–382 (2019). https://doi.org/10.1007/s11802-019-3858-x

