Abstract
This research proposes a motion-enhanced display that uses the display itself as a communication medium: the display mimics the motion of a human head to enhance the sense of presence in remote communication. The idea has been implemented as an augmented telepresence system called ARM-COMS (ARm-supported eMbodied COmmunication Monitor System). ARM-COMS detects the orientation of a subject's face with a face-detection tool based on image processing and mimics the head motion of the remote partner. In addition, ARM-COMS uses the audio signal of the conversation to react appropriately when the communication partner speaks, even when the video shows no significant motion.
This paper covers two topics. The first is a new design of the ARM-COMS robotic arm, which uses a six-axis servo motor configuration to enable smoother motion. Alongside the hardware configuration, the software configuration, based on the ROS framework, is also presented.
The second topic is an experimental configuration based on a camera stabilizer. This study examined the feasibility of this experimental configuration for ARM-COMS. If the approach proves feasible, it could be applied to the redesigned ARM-COMS robotic arm system.
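The two behaviours described in the abstract, following the detected head orientation and reacting to speech when no motion is visible, can be illustrated with a minimal sketch. This is not the authors' implementation: the `HeadPose` type, the angle limits, the smoothing factor, the audio threshold, and the use of a small nod as the speech reaction are all assumptions made for illustration, and no ROS or face-detection library is involved.

```python
import math
from dataclasses import dataclass


@dataclass
class HeadPose:
    """Head orientation in degrees (hypothetical representation)."""
    yaw: float
    pitch: float
    roll: float


# Assumed mechanical limits of the display mount, in degrees (not from the paper).
LIMITS = {"yaw": 45.0, "pitch": 30.0, "roll": 20.0}


def clamp(value: float, limit: float) -> float:
    """Restrict value to the symmetric range [-limit, +limit]."""
    return max(-limit, min(limit, value))


def smooth(previous: float, target: float, alpha: float) -> float:
    """Exponential smoothing: move a fraction alpha of the way to the target."""
    return previous + alpha * (target - previous)


def follow_head(previous: HeadPose, detected: HeadPose, alpha: float = 0.3) -> HeadPose:
    """Compute the next display pose from the detected head pose.

    Each detected angle is clamped to the assumed limits and then smoothed
    against the previous display pose to avoid jerky motion.
    """
    return HeadPose(
        yaw=smooth(previous.yaw, clamp(detected.yaw, LIMITS["yaw"]), alpha),
        pitch=smooth(previous.pitch, clamp(detected.pitch, LIMITS["pitch"]), alpha),
        roll=smooth(previous.roll, clamp(detected.roll, LIMITS["roll"]), alpha),
    )


def rms(samples: list[float]) -> float:
    """Root-mean-square level of an audio frame (0.0 for an empty frame)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def speech_nod(pose: HeadPose, samples: list[float],
               threshold: float = 0.1, nod_deg: float = 5.0) -> HeadPose:
    """Add a small downward pitch offset when the audio frame is loud enough.

    The nod is an assumed stand-in for a speech-driven reaction.
    """
    if rms(samples) > threshold:
        return HeadPose(pose.yaw, clamp(pose.pitch - nod_deg, LIMITS["pitch"]), pose.roll)
    return pose
```

In such a loop, `follow_head` would run on every video frame and `speech_nod` on every audio frame, so that the display still moves when the partner is speaking but holding their head still.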
Acknowledgement
This work was partly supported by JSPS KAKENHI Grant Number JP19K12082 and the Original Research Grant 2019 of Okayama Prefectural University. The author would like to acknowledge Risa Tanaka and all members of Kansei Information Engineering Labs at Okayama Prefectural University for their cooperation in conducting the experiments.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Ito, T., Oyama, T., Watanabe, T. (2020). Speech Recognition Approach for Motion-Enhanced Display in ARM-COMS System. In: Stephanidis, C., Kurosu, M., Degen, H., Reinerman-Jones, L. (eds) HCI International 2020 - Late Breaking Papers: Multimodality and Intelligence. HCII 2020. Lecture Notes in Computer Science, vol 12424. Springer, Cham. https://doi.org/10.1007/978-3-030-60117-1_10
DOI: https://doi.org/10.1007/978-3-030-60117-1_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60116-4
Online ISBN: 978-3-030-60117-1
eBook Packages: Computer Science, Computer Science (R0)