MIVIABot: A Cognitive Robot for Smart Museum

Saggese, Alessia; Vento, Mario; Vigilante, Vincenzo

doi:10.1007/978-3-030-29888-3_2

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11678)

Included in the following conference series:

International Conference on Computer Analysis of Images and Patterns

Abstract

Cognitive robots are robots endowed with artificial intelligence capabilities that let them interact appropriately with people and objects in an a priori unknown environment. For instance, a humanoid robot can be perceived as a plausible tourist guide in a museum. In this work, we show how recent findings in machine learning and pattern recognition can be applied to equip a robot with perception capabilities advanced enough to guide visitors through the halls and attractions of a museum.

The challenge of running all these algorithms in real time on a mobile, embedded platform is tackled at the architectural level: all the artificial intelligence features are tuned to run with a low computational burden, and a neural network accelerator is included in the hardware setup. Avoiding cloud services in the system yields improved robustness and predictable latency.

Our robot, which we call MIVIABot, is able to decode and understand speech and to extract soft biometrics from its interlocutor, such as age, gender and emotional state. The robot integrates all these elements in a dialog using basic Natural Language Processing capabilities.
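
As a rough illustration of what integrating these elements in a dialog could look like, the following minimal Python sketch combines a speech transcript with soft-biometric estimates to choose a reply. It is not the authors' implementation: the `Perception` class, the `compose_reply` function, the keyword rules, the age threshold and the emotion labels are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Perception:
    """One snapshot of what the robot perceives (all fields are hypothetical)."""
    transcript: str                 # text produced by the speech recognizer
    age: Optional[int] = None       # estimated age in years, if available
    gender: Optional[str] = None    # estimated gender label (unused in this sketch)
    emotion: Optional[str] = None   # estimated emotional state, e.g. "sad"


def compose_reply(p: Perception) -> str:
    """Pick a reply from the transcript, then adapt it to the soft biometrics."""
    text = p.transcript.lower()
    # Very basic keyword matching stands in for the NLP component.
    if "where" in text:
        reply = "The next hall is straight ahead."
    elif "hello" in text:
        reply = "Welcome to the museum!"
    else:
        reply = "Could you repeat that, please?"

    # Adapt the reply to the estimated age (the threshold is an assumption).
    if p.age is not None and p.age < 12:
        reply += " Would you like to hear a fun story about it?"
    # Acknowledge a negative emotional state if one was detected.
    if p.emotion == "sad":
        reply = "I hope this visit cheers you up. " + reply
    return reply
```

For example, `compose_reply(Perception("Where is the exit?", age=8))` appends the child-oriented follow-up to the directions, while a perception with no biometric estimates falls back to the plain reply.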

Author information

Authors and Affiliations

Università Degli Studi Di Salerno, 84084, Salerno, SA, Italy
Alessia Saggese, Mario Vento & Vincenzo Vigilante

Corresponding author

Correspondence to Vincenzo Vigilante.

Editor information

Editors and Affiliations

Department of Computer and Electrical Engineering and Applied Mathematics, University of Salerno, Fisciano (SA), Italy
Mario Vento
Department of Computer and Electrical Engineering and Applied Mathematics, University of Salerno, Fisciano (SA), Italy
Gennaro Percannella

Cite this paper

Saggese, A., Vento, M., Vigilante, V. (2019). MIVIABot: A Cognitive Robot for Smart Museum. In: Vento, M., Percannella, G. (eds) Computer Analysis of Images and Patterns. CAIP 2019. Lecture Notes in Computer Science, vol 11678. Springer, Cham. https://doi.org/10.1007/978-3-030-29888-3_2

DOI: https://doi.org/10.1007/978-3-030-29888-3_2
Published: 22 August 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29887-6
Online ISBN: 978-3-030-29888-3
eBook Packages: Computer Science, Computer Science (R0)
