Abstract
Human-computer interfaces and multimodal interaction are increasingly present in everyday life. Environments equipped with sensors can acquire and interpret a wide range of information, assisting humans in several application areas, such as behaviour understanding, event detection, and action recognition. In these areas, suitably processing this information is key to properly structuring multimodal data. In particular, heterogeneous devices and different acquisition times can be exploited to improve recognition results. On the basis of these assumptions, this paper proposes a multimodal system based on Allen's temporal logic combined with a prediction method. The main goal of the system is to correlate user events with system reactions. After post-processing the data coming from different acquisition devices (e.g., RGB images, depth maps, sounds, proximity sensors), the system manages the correlations between recognition/detection results and events in real time, thus creating an interactive environment for users. To increase recognition reliability, a predictive model is also associated with the method. The modularity of the system allows fully dynamic development and upgrades through customized modules. Finally, comparisons with similar systems are presented, underlining the high flexibility and robustness of the proposed event management method.
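The core idea in the abstract, relating the time intervals of detected events via Allen's temporal logic, can be illustrated with a minimal sketch. The `Interval` type and the function below are illustrative assumptions, not the paper's implementation; only the seven basic Allen relations are distinguished explicitly, with the six inverse relations recoverable by swapping the arguments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A closed time interval for a detected event, with start < end."""
    start: float
    end: float


def allen_relation(a: Interval, b: Interval) -> str:
    """Return the basic Allen relation that holds from a to b.

    Covers the seven basic relations; the remaining six are the
    inverses, obtained by calling allen_relation(b, a).
    """
    if a.end < b.start:
        return "before"
    if a.end == b.start:
        return "meets"
    if a.start == b.start and a.end == b.end:
        return "equals"
    if a.start == b.start and a.end < b.end:
        return "starts"
    if a.start > b.start and a.end < b.end:
        return "during"
    if a.start > b.start and a.end == b.end:
        return "finishes"
    if a.start < b.start and b.start < a.end < b.end:
        return "overlaps"
    return "inverse"  # one of the six inverse relations; swap args to name it
```

For instance, a gesture detected over `Interval(0.0, 1.0)` and a sound over `Interval(1.0, 2.0)` stand in the "meets" relation, which a rule engine could use to trigger a combined system reaction.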
Acknowledgements
This work was supported in part by the MIUR under grant “Departments of Excellence 2018-2022” of the Department of Computer Science of Sapienza University.
Avola, D., Cinque, L., Del Bimbo, A. et al. MIFTel: a multimodal interactive framework based on temporal logic rules. Multimed Tools Appl 79, 13533–13558 (2020). https://doi.org/10.1007/s11042-019-08590-1