Abstract
To support interaction between marine robots and divers in the underwater environment, a method for diver gesture recognition and segmentation is proposed. The method first optimizes a generative adversarial network using the progressive growing training strategy, enabling it to generate high-resolution images with complex content. This generative adversarial network is then used as a data augmentation method to produce additional high-resolution training images. We create masks for the gestures in the augmented dataset and apply the Mask R-CNN algorithm for gesture recognition and segmentation. Experimental results show that the generated data improves the accuracy of several object recognition algorithms but cannot fully replace the original data, and that the mean average precision of gesture recognition reaches 0.85. Visualizations illustrate both the validity and the weaknesses of the segmentation.
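The reported score of 0.85 is a mean average precision (mAP): the average precision (AP) is computed per gesture class from a ranked list of detections, then averaged over classes. As a minimal sketch of this standard metric (all-point interpolation; not the authors' evaluation code, and the toy scores below are illustrative only):

```python
import numpy as np

def average_precision(scores, labels, n_gt):
    """AP for one gesture class.

    scores: detection confidence scores; labels: 1 if the detection
    matches a ground-truth gesture, else 0; n_gt: number of
    ground-truth instances of this class.
    """
    order = np.argsort(-np.asarray(scores))          # rank by confidence
    tp = np.asarray(labels, dtype=float)[order]
    tp_cum = np.cumsum(tp)
    fp_cum = np.cumsum(1.0 - tp)
    recall = tp_cum / n_gt
    precision = tp_cum / (tp_cum + fp_cum)
    # All-point interpolation: make precision non-increasing in recall,
    # then integrate precision over recall.
    for i in range(len(precision) - 2, -1, -1):
        precision[i] = max(precision[i], precision[i + 1])
    r = np.concatenate(([0.0], recall))
    p = np.concatenate(([precision[0]], precision))
    return float(np.sum((r[1:] - r[:-1]) * p[1:]))

# mAP is the mean of the per-class APs (toy two-class example).
ap_per_class = [average_precision([0.9, 0.8, 0.7], [1, 0, 1], 2),
                average_precision([0.95, 0.6], [1, 1], 2)]
map_score = sum(ap_per_class) / len(ap_per_class)
```

The monotone-precision step is what distinguishes interpolated AP from the raw area under the precision-recall curve; COCO-style evaluation additionally averages AP over several IoU thresholds.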
Acknowledgements
This work was supported by the National Natural Science Foundation of China under Grant 62072211, Grant 51809112, and Grant 51679105. This work was also supported by the Science-Technology Development Plan Project of Jilin Province of China through Grants 20170101081JC, 20190303006SF and 20190302107GX.
Cite this article
Jiang, Y., Zhao, M., Wang, C. et al. Diver’s hand gesture recognition and segmentation for human–robot interaction on AUV. SIViP 15, 1899–1906 (2021). https://doi.org/10.1007/s11760-021-01930-5