MIVIABot: A Cognitive Robot for Smart Museum

  • Conference paper
Computer Analysis of Images and Patterns (CAIP 2019)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11678)

Abstract

Cognitive robots are robots equipped with artificial intelligence capabilities that allow them to interact properly with people and objects in an a priori unknown environment. For instance, a humanoid robot can be perceived as a plausible tourist guide in a museum. In this work we show how recent findings in machine learning and pattern recognition can be applied to equip a robot with perception capabilities advanced enough to guide visitors through the halls and attractions of a museum.

The challenge of running all these algorithms in real time on a mobile, embedded platform is tackled at the architectural level: all the artificial intelligence features are tuned to run with a low computational burden, and a neural network accelerator is included in the hardware setup. Avoiding cloud services in the system improves robustness and makes latency predictable.

Our robot, which we call MIVIABot, is able to decode and understand speech and to extract soft biometrics from its interlocutor, such as age, gender and emotional state. The robot can integrate all these elements into a dialog, using basic Natural Language Processing capabilities.

Author information

Corresponding author

Correspondence to Vincenzo Vigilante.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Saggese, A., Vento, M., Vigilante, V. (2019). MIVIABot: A Cognitive Robot for Smart Museum. In: Vento, M., Percannella, G. (eds) Computer Analysis of Images and Patterns. CAIP 2019. Lecture Notes in Computer Science, vol 11678. Springer, Cham. https://doi.org/10.1007/978-3-030-29888-3_2

  • DOI: https://doi.org/10.1007/978-3-030-29888-3_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-29887-6

  • Online ISBN: 978-3-030-29888-3

  • eBook Packages: Computer Science, Computer Science (R0)
