Abstract
Object detection in underwater optical images is of great significance to many underwater missions, such as salvaging sunken objects and exploring marine organisms. However, underwater objects are often small and dense, which makes them difficult to detect. To tackle these issues, we propose a novel underwater object detection framework named Concatenate and Shuffle Network (CSNet), based on center-point detection, which not only detects small and dense objects with high accuracy but also runs in real time. First, we propose a multi-scale fusion strategy called Feature Concatenation Shuffle (FCS), which fully integrates detailed features from the shallow layers of the convolutional neural network into the deep layers, enhancing the network's ability to extract features of small objects. Moreover, to accelerate our method, we propose a lightweight deconvolution block (DB) that combines a dual-branch feature-fusion structure with a lightweight deconvolution method. In addition, we study the advantages of center-point detection for dense objects and introduce it into our detector. Finally, experiments show that CSNet achieves the best speed-accuracy trade-off on URPC 2018, with 39.7% AP at 58.8 FPS and 42.4% AP with multi-scale testing at 5.7 FPS. Compared with several state-of-the-art detectors, CSNet reaches competitive accuracy at a breakthrough speed and can run in real time under various computing conditions.
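The full paper is behind the access wall, so the exact FCS design is not shown here. As a rough illustration of the general idea the name suggests, the sketch below concatenates a shallow (detail) feature map with a deep (semantic) one along the channel axis and then applies a ShuffleNet-style channel shuffle so that subsequent group convolutions mix information from both sources. The function names `channel_shuffle` and `fcs_fuse` and the group count are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def channel_shuffle(x, groups):
    # ShuffleNet-style shuffle on an (N, C, H, W) tensor:
    # split C into `groups`, then interleave channels across groups.
    n, c, h, w = x.shape
    assert c % groups == 0, "channel count must be divisible by groups"
    return (x.reshape(n, groups, c // groups, h, w)
             .transpose(0, 2, 1, 3, 4)
             .reshape(n, c, h, w))

def fcs_fuse(shallow, deep, groups=2):
    # Hypothetical concatenate-and-shuffle fusion step: stack the
    # shallow and deep maps channel-wise, then shuffle so each channel
    # group contains features from both scales.
    fused = np.concatenate([shallow, deep], axis=1)
    return channel_shuffle(fused, groups)
```

With two shallow channels of zeros and two deep channels of ones, the fused output alternates 0, 1, 0, 1 along the channel axis, showing that the shuffle interleaves the two sources rather than leaving them in separate halves.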
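CSNet builds on center-point detection in the style of "Objects as Points" (Zhou et al., 2019), where detections are read off a center heatmap without conventional NMS: a location counts as an object center if it is the maximum of its local neighborhood. A minimal sketch of that peak-extraction step, with an illustrative threshold of 0.3:

```python
import numpy as np

def heatmap_peaks(heat, k=3, thresh=0.3):
    # NMS-free peak extraction used by center-point detectors:
    # keep (y, x) if heat[y, x] clears the threshold and equals the
    # max of its k x k neighborhood.
    h, w = heat.shape
    pad = k // 2
    padded = np.pad(heat, pad, constant_values=-np.inf)
    peaks = []
    for y in range(h):
        for x in range(w):
            window = padded[y:y + k, x:x + k]
            if heat[y, x] >= thresh and heat[y, x] == window.max():
                peaks.append((y, x, heat[y, x]))
    return peaks
```

Because each object contributes one peak rather than many overlapping boxes, this formulation copes well with dense scenes, which is the property the paper exploits for tightly clustered underwater objects.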
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Jiang, X., Mao, Z., Shen, J. (2022). Concatenate and Shuffle Network: A Real-Time Underwater Object Detector for Small and Dense Objects. In: Wu, M., Niu, Y., Gu, M., Cheng, J. (eds) Proceedings of 2021 International Conference on Autonomous Unmanned Systems (ICAUS 2021). ICAUS 2021. Lecture Notes in Electrical Engineering, vol 861. Springer, Singapore. https://doi.org/10.1007/978-981-16-9492-9_64
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-9491-2
Online ISBN: 978-981-16-9492-9
eBook Packages: Intelligent Technologies and Robotics