Diver’s hand gesture recognition and segmentation for human–robot interaction on AUV

  • Original Paper
  • Published:
Signal, Image and Video Processing

Abstract

For the interaction between marine robots and divers in the underwater environment, a method for diver hand gesture recognition and segmentation is proposed. The method first applies progressive growing training to optimize a generative adversarial network so that it can generate high-resolution images with complex content. The trained generative adversarial network is then used as a data augmentation tool to produce additional high-resolution gesture images. Masks of the gestures in the new dataset are created, and the Mask R-CNN algorithm is used for gesture recognition and segmentation. The experimental results show that the generated data improves the accuracy of several object recognition algorithms but cannot completely replace the original data, and the mean average precision of gesture recognition reaches 0.85. The visualization illustrates both the validity and the weaknesses of the segmentation.
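To make the second stage of the pipeline concrete, the sketch below shows one common way to set up an instance segmentation model of the kind described in the abstract. It is not the authors' implementation: it adapts an off-the-shelf, COCO-pretrained Mask R-CNN from torchvision to a set of diver gesture classes, and the class count, image size, and function name are assumptions introduced only for illustration.

```python
# A minimal, hypothetical sketch (not the authors' code): adapting a
# COCO-pretrained Mask R-CNN from torchvision to diver gesture classes.
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

NUM_CLASSES = 1 + 10  # background + an assumed number of gesture classes


def build_gesture_mask_rcnn(num_classes: int = NUM_CLASSES):
    # Start from a Mask R-CNN with a ResNet-50 FPN backbone pretrained on COCO.
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")

    # Replace the box classification head so it predicts the gesture classes.
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

    # Replace the mask head so it outputs one mask per gesture class.
    in_channels = model.roi_heads.mask_predictor.conv5_mask.in_channels
    model.roi_heads.mask_predictor = MaskRCNNPredictor(in_channels, 256, num_classes)
    return model


if __name__ == "__main__":
    model = build_gesture_mask_rcnn()
    model.eval()
    with torch.no_grad():
        frame = [torch.rand(3, 512, 512)]  # one RGB underwater frame, values in [0, 1]
        outputs = model(frame)             # per-image dict: boxes, labels, scores, masks
    print({k: v.shape for k, v in outputs[0].items()})
```

In the paper's pipeline, such a model would be fine-tuned on the gesture dataset after it has been enlarged with images produced by the progressively grown GAN; the augmentation step itself is a separate training procedure not shown here.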



Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grant 62072211, Grant 51809112, and Grant 51679105. This work was also supported by the Science-Technology Development Plan Project of Jilin Province of China through Grants 20170101081JC, 20190303006SF and 20190302107GX.

Author information


Corresponding author

Correspondence to Hong Qi.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article


Cite this article

Jiang, Y., Zhao, M., Wang, C. et al. Diver’s hand gesture recognition and segmentation for human–robot interaction on AUV. SIViP 15, 1899–1906 (2021). https://doi.org/10.1007/s11760-021-01930-5
