Abstract
This paper proposes a system that uses multimodal techniques to automatically estimate oral presentation skills. It is based on a set of features from three sources: audio, gesture, and PowerPoint slides. Machine learning techniques are used to classify each presentation into two classes (high vs. low quality) and into three classes (low-, average-, and high-quality presentations). Around 448 multimodal recordings from the MLA'14 dataset were used for training and evaluating three different 2-class and 3-class classifiers. The classifiers were evaluated on each feature type independently and on all features combined. The best accuracy of the 2-class systems is 90.1%, achieved by an SVM trained on audio features; the best accuracy of the 3-class systems is 75%, achieved by a random forest trained on slide features. Combining the three feature types into one vector improves the accuracy of all systems by around 5%.
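The pipeline described above, concatenating per-presentation feature vectors from the three modalities (early fusion) and training an SVM and a random forest, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the feature values are random placeholders, the dimensionalities are invented, and the paper's actual audio, gesture, and slide features are not reproduced here.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 448  # number of presentations, as in the MLA'14 setup described above

# Hypothetical per-presentation feature vectors; dimensions are placeholders,
# not the feature sets used in the paper.
audio = rng.normal(size=(n, 20))
gesture = rng.normal(size=(n, 10))
slides = rng.normal(size=(n, 8))
labels = rng.integers(0, 2, size=n)  # 2-class task: low (0) vs. high (1)

# Early fusion: concatenate the three feature types into one vector.
fused = np.hstack([audio, gesture, slides])

svm = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
rf = RandomForestClassifier(n_estimators=100, random_state=0)

for name, clf in [("SVM", svm), ("Random forest", rf)]:
    scores = cross_val_score(clf, fused, labels, cv=5)
    print(f"{name}: mean 5-fold accuracy {scores.mean():.2f}")
```

On random labels the accuracies hover around chance; the point of the sketch is the fusion step, where evaluating on `fused` rather than on `audio`, `gesture`, or `slides` alone mirrors the comparison reported in the abstract.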
© 2017 Springer International Publishing AG
Cite this paper
Hanani, A., Al-Amleh, M., Bazbus, W., Salameh, S. (2017). Automatic Estimation of Presentation Skills Using Speech, Slides and Gestures. In: Karpov, A., Potapova, R., Mporas, I. (eds.) Speech and Computer. SPECOM 2017. Lecture Notes in Computer Science, vol. 10458. Springer, Cham. https://doi.org/10.1007/978-3-319-66429-3_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-66428-6
Online ISBN: 978-3-319-66429-3