1 Introduction

Experiencing a great presenter delivering a novel idea is an inspiring event. Accordingly, humans have been studying the art of oratory for at least the last 2500 years [1]. Today, the ability to present effectively is considered a core competence for educated professionals [2–5]. The importance of learning how to communicate effectively is reinforced by the thought that ideas are the currency of the twenty-first century [6]. How to develop public speaking skills has already been studied extensively. One conclusion to be drawn from these studies is that practice and feedback are key to the development of these skills [7]. Although it is possible to attend courses and seminars on public speaking, opportunities to practice and receive feedback from tutors or peers under realistic conditions are limited.

Sensors have lately become increasingly popular [8] and have proven to be a technology with great potential to enhance learning, either by providing users with feedback in scenarios where human feedback is not available or by giving access to additional data sources [9]. This has led to the development and study of new sensor-based technologies designed to support users in developing their public speaking skills [10–13]. These technologies are not yet widespread, and so far their impact has not been tested outside of controlled laboratory conditions. One of these technologies is the Presentation Trainer (PT), a multimodal tool designed to support the development of basic public speaking skills by creating opportunities for learners to practice their presentations while receiving feedback [13]. This paper describes a field study in which we took the PT out of the laboratory and tested it in a classroom. The paper discusses the implications of using such a system in the wild, and identifies which of the findings from a lab setting [13] also hold in the real world.

2 Background Work

Educational interventions such as feedback are needed to develop public speaking skills [14]. Having a human tutor available to give feedback on these skills is not always feasible or affordable. Therefore, technological interventions designed to provide this feedback are desirable. Public speaking requires presenters to make coherent use of their verbal and nonverbal channels. Measuring these multimodal performances in a timely manner and with acceptable accuracy is challenging. In recent years, however, driven by the rising availability of sensors, research has been undertaken on multimodal learning applications designed to support the development of public speaking skills.

During a presentation, presenters communicate their messages using their voice together with their full body language, e.g., body posture, use of stage, eye contact, facial expressions and hand gestures. Multimodal learning applications supporting the development of public speaking skills [10–16] generally use a depth sensor such as the Microsoft Kinect (footnote 1) to capture the body language of the user, and microphone devices to capture the user’s voice.

Studies on applications designed to support public speaking skills have explored effective strategies for providing feedback to users. In [11], feedback indicating whether the energy, body posture and speech rate are correct is displayed on a Google Glass (footnote 2). Another feedback strategy, employed in [10, 15], is the use of a virtual audience. Members of the virtual audience change postures and behaviors depending on the nonverbal communication of the user. Besides displaying the virtual audience, the prototype in [10] also provides the user with direct visual indications regarding her own body posture. The applications in [12, 16] provide the user with a dashboard interface that displays a mirrored image of the user together with modules indicating the use of nonverbal communication aspects such as gestures and voice. In line with that, the feedback interface of the PT shows a mirror image of the user and displays at most one instruction regarding her nonverbal communication at any given time (see Fig. 1). This instruction is communicated to the user through a visual and a haptic channel [13].

Fig. 1. PT telling the user to correct the posture.

The impact of this type of application on learners has also been studied, showing positive results in laboratory conditions. In the study of [10], the system’s feedback on how closed or open the learner’s body posture was helped learners to become more aware of their body posture. The impact of the PT’s feedback on learners has also been studied in controlled setups. The study in [13] showed, through objective measures made by the system, that after five practice sessions with feedback from the PT learners reduced their nonverbal mistakes by 75 % on average.

3 Purpose

In this study we tested the PT in a classroom setting following an exploratory research approach [17], focusing on three main objectives:

Objective 1: The first objective of this study is to explore the implications of investigating the use of a tool such as the PT in a regular learning scenario outside of a laboratory setup.

Objective 2: Studies on multimodal learning applications for public speaking have shown promising results in laboratory conditions according to quantified and timely machine measurements [10, 13]. However, the purpose of a presentation is to transmit the desired message to a human audience and have the desired impact on it, rather than to improve a machine-based score. Studies showing evidence that an improved performance according to machine measurements is reflected in a better presentation according to a human audience are still missing. Therefore, the second objective of this study is to gain insights into how the improvements obtained by a learner using the PT to practice for a presentation relate to the impact that the trained presentation has on the audience. In other words, to what extent does an audience agree with the PT that a presentation has improved?

Objective 3: Good public speaking skills are a core competence for today’s professionals [2–5]; therefore, teaching these skills has become a common target for different courses. Feedback is a key aspect of learning and developing public speaking skills [7], so current courses in public speaking include well-established feedback practices to help learners develop these skills. The effectiveness of this feedback depends on several variables, one of which is the source of the feedback. Feedback provided by a tutor in combination with feedback provided by peer students has proven to be more effective than feedback provided only by a human tutor [18]. The third objective of this study is to investigate the introduction of the PT into the already established practices for teaching public speaking skills, exploring whether its use and feedback contribute to the creation of more comprehensive learning scenarios for students.

4 Methodology

4.1 Study Context

We conducted this study in the setting of an entrepreneurship course for master's students at a university. In this course the students were divided into two teams, each representing an entrepreneurial business. During the course the teams have to develop and present their projects, and the students therefore receive some guidance on presenting. The teams present their projects twice, in the middle and at the end of the course. The mid-term presentations are recorded, and in the following sessions tutors and peers use these recordings to give the students feedback on their presentation skills.

4.2 Study Procedure

This study was conducted some sessions after the students had presented their projects and received feedback. Nine participants, seven males and two females between 24 and 28 years old, took part in the study. A sketch of the study is shown in Fig. 2. To prepare for the study, students were given the homework of individually preparing a 60–120 s pitch about their project. One week later the study was conducted during a two-hour session slot.

Fig. 2. Study procedure.

The study started with students individually presenting their pitch in front of their peers and course teachers. The objective of this first pitch was to obtain a baseline of the students’ performance. Peers evaluated the pitch by filling in a presentation assessment questionnaire.

After presenting the pitch, each student moved to another room for the practice sessions. Before the practice sessions, students received a short briefing on the PT’s feedback. The purpose of this briefing was to reduce the exploration time needed to understand the feedback given by the PT. After the briefing, participants were expected to know how to react correctly to the PT’s feedback. The practice sessions consisted of delivering the pitch two consecutive times while receiving feedback from the PT. During the practice sessions students stood between 1.5 and 3 m in front of the Microsoft Kinect sensor and a laptop with a 13-inch display running the PT.

For the next phase of the study, each student returned to the classroom and presented the pitch once more to their peers. The objective of this second pitch was to explore the effects of the practice sessions. To observe these effects, peers evaluated this final presentation by filling in the presentation assessment questionnaire again. The PT was also used to assess these pitches. However, due to a technical failure, only the pitches given by the last three participants were assessed by the PT. After delivering this final pitch, students were asked to fill in a questionnaire about their experience of using the PT to practice.

4.3 Apparatus and Material

To evaluate the pitches done by the students, peers filled in a presentation assessment questionnaire. The questionnaire consists of eleven Likert-scaled items. The first seven items refer to a general assessment of the presentation including: the overall quality of the presentation, delivery of the presentation, speaker knowledge about the topic, confidence of the speaker, enthusiasm of the speaker, understandability of the pitch, and fun factor of the pitch. The last four items consisted of some of the specific nonverbal behaviors that can be trained using the PT: posture, use of gestures, voice quality, and use of pauses.

To practice for the second presentation of the pitch, students used the current version of the PT. This version uses the immediate feedback mechanism described in [13], providing users with at most one corrective feedback instruction at a time regarding their body posture, use of gestures, voice volume, phonetic pauses or filler sounds, use of pauses, and facial expressions (45 s without smiling). The PT logs all recognizable behaviors (mistakes and good practices) as events. At the end of each practice session it displays these events on a timeline (see Fig. 3), allowing learners to get an overall picture of their performance. The logs are stored in files that can later be used for data analysis.

Fig. 3. Timeline displaying all tracked events, shown to the user after the presentation.

A user experience questionnaire was used to capture the students’ impressions of using the PT. This questionnaire consists of seven items in total: five Likert-scale items and two open questions. Its purpose was to inquire about the students’ perceived learning, the usefulness of the system, and how the system’s assessment compares with human assessment.

5 Results

The peer evaluation of the first pitches is shown in Fig. 4. Regarding the general aspects of the pitch, the item with the best score was the knowledge about the topic displayed by the presenter with an average score of 3.76 and the item with the lowest score was the entertaining factor of the pitch with an average score of 3.1. The nonverbal communication behavior with the highest score was the voice quality of the presenter with an average score of 3.73 and the behavior with the lowest score was the proper use of pauses during the pitch with an average score of 3.21.

Fig. 4. Evaluation scores of the first pitches.

After giving the first pitch, students practiced it twice using the PT. We analyzed these practice sessions using the log files created by the PT. To evaluate the impact of each behavior identified by the PT, we used the proportion of time that the behavior was displayed during the training session (pTM). The pTM value for each behavior ranges from 0 to 1, where 0 indicates that the behavior was not displayed at all and 1 indicates that the behavior was identified throughout the whole presentation. The average pTM values for all tracked behaviors are displayed in Table 1. The results indicate that, on average, participants improved in all trained aspects in the second practice session. The behavior that on average received the worst assessment in the first practice session was the use of gestures, followed by voice volume and then posture. The pTM values for the other tracked behaviors were very low. In the second practice session voice volume received the worst assessment, followed by gestures and then posture. The area showing the biggest improvement was the use of gestures.
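As an illustration of how a pTM value could be derived from logged events, the following minimal sketch assumes that each event is stored as a (behavior, start, end) tuple in seconds and that events of the same behavior do not overlap. The log format, the function name and the sample data are hypothetical and are not taken from the PT.

```python
from collections import defaultdict


def ptm_per_behavior(events, session_length):
    """Return, per behavior, the fraction of session time (0..1) it was logged.

    events: iterable of (behavior, start_s, end_s) tuples (hypothetical format).
    session_length: duration of the practice session in seconds.
    """
    if session_length <= 0:
        raise ValueError("session_length must be positive")
    totals = defaultdict(float)
    for behavior, start, end in events:
        # Clip each event to the session boundaries before adding its duration.
        start = max(0.0, start)
        end = min(session_length, end)
        if end > start:
            totals[behavior] += end - start
    # Cap at 1.0 in case the assumption of non-overlapping events is violated.
    return {b: min(1.0, t / session_length) for b, t in totals.items()}


# Made-up log for a 100 s pitch.
log = [("gestures", 0, 30), ("voice_volume", 40, 50), ("gestures", 60, 80)]
print(ptm_per_behavior(log, 100))
```

With this made-up log, gestures are flagged for 50 of the 100 seconds and voice volume for 10, so lower values correspond to fewer detected mistakes.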

Table 1. pTM scores captured during the practice sessions. Mean and standard deviation.

The peer evaluation of the pitches presented after the practice sessions is shown in Fig. 5. Regarding the general assessment of the pitches, the item with the highest score was the knowledge about the topic displayed by the speaker, with an average score of 3.96. The item with the lowest score, with an average of 3.55, was the entertaining factor of the pitch. Regarding the nonverbal communication aspects, the one with the highest score was the voice quality of the presenter, with an average of 4.14, and the correct use of pauses was the lowest, with an average of 3.71.

Fig. 5. Evaluation scores of the second pitches.

To explore the relevance of having a tool designed specifically to practice the delivery of the pitch, we used Pearson’s r to measure the correlation between the scores for the overall quality of the pitch (content + delivery) and the scores for its delivery. These measurements show a correlation of r = 0.94 (n = 18, p < 0.01). To explore the relevance of training the behaviors that can be trained using the PT, we also used Pearson’s r to measure the correlation between the scores for these behaviors and the overall quality of the presentations (see Table 3). The behavior displaying the strongest correlation was the use of pauses, followed by posture, voice quality and use of gestures.
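For reference, Pearson’s r between two lists of scores can be computed as in the following sketch. The function name and the sample scores are illustrative only and are not data from this study.

```python
import math


def pearson_r(xs, ys):
    """Pearson's linear correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Made-up Likert scores: overall quality vs. delivery for five pitches.
overall = [3.0, 3.5, 4.0, 4.5, 3.2]
delivery = [2.8, 3.6, 4.1, 4.4, 3.0]
print(round(pearson_r(overall, delivery), 3))
```

A value close to 1 means that pitches rated highly for delivery also tend to be rated highly overall, which is the relationship the reported r = 0.94 describes.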

Table 3. Pearson’s linear correlation. Mean and standard deviation.

Figure 6 compares the evaluations of the first and second pitches. The comparison shows an improvement in all evaluated items. The general quality of the pitches increased by 21.94 %. We tested the significance of this difference using a t-test, which yielded t(14) = 3.6, p < .01, indicating that the observed improvement is statistically significant. Regarding the general aspects of a presentation, the delivery of the pitch was the item displaying the biggest improvement, with an increase of 24.27 %. The item showing the smallest improvement was the knowledge about the topic displayed by the presenter, with an improvement of only 14.37 %.
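The text does not state which variant of the t-test was applied. As a sketch under the assumption of an independent two-sample test with pooled variance, the t statistic and its degrees of freedom could be computed as follows. The function name and the scores are made up for illustration.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(before, after):
    """Student's two-sample t statistic (pooled variance) and degrees of freedom.

    Assumes two independent samples with similar variance; this is an
    assumption of the sketch, not a description of the study's analysis.
    """
    n1, n2 = len(before), len(after)
    if n1 < 2 or n2 < 2:
        raise ValueError("each sample needs at least two values")
    df = n1 + n2 - 2
    # Pooled estimate of the common variance (sample variances use n - 1).
    pooled_var = ((n1 - 1) * variance(before) + (n2 - 1) * variance(after)) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    if standard_error == 0:
        raise ValueError("t is undefined when both samples have zero variance")
    t = (mean(after) - mean(before)) / standard_error
    return t, df


# Made-up overall-quality scores for first and second pitches.
first = [3.0, 3.2, 3.5, 3.1, 3.4, 3.3, 2.9, 3.6]
second = [3.8, 4.0, 4.1, 3.7, 4.2, 3.9, 3.6, 4.3]
t, df = pooled_t_statistic(first, second)
print(f"t({df}) = {t:.2f}")
```

The p value is then obtained from the t distribution with the returned degrees of freedom, for example from a statistical table or a statistics library.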

Fig. 6. Comparison between first and second pitch.

Among the nonverbal communication behaviors, the use of gestures displayed the biggest improvement, with an increase of 27.89 %.

The PT’s assessment of the second pitch for the last three speakers is shown in Table 2 (footnote 3). Results from these tracked performances show that all of them had a total pTM value lower than 1.

Table 2. pTM values for the last three speakers on their final pitches.

Results from the user experience questionnaire are listed in Table 4. These scores show that students would likely use the PT to prepare for future presentations. Results show that students perceived an increment of their nonverbal communication awareness. Students felt that the feedback of the PT is more useful as an addition rather than as a reinforcement of the feedback that peers and tutors can provide.

Table 4. Results from the user experience questionnaire. Mean and standard deviation.

When asked about the similarities between the PT’s feedback and the feedback received in previous sessions from tutors and peers, all students mentioned the correct use of pauses while presenting. Two of them also mentioned the use of gestures. Four students mentioned that their tutors and peers had previously told them that they did not make enough eye contact with the audience, and that this aspect is missing from the PT’s feedback. Three students commented that receiving immediate feedback from the system makes it much easier to identify and correct their behavior. One student mentioned that the PT gave feedback regarding phonetic pauses while peers and tutors did not. One student mentioned a contradiction in the feedback regarding the use of voice: peers and tutors in a previous presentation told the participant to speak louder, whereas during the training sessions the PT told the participant to speak more softly.

6 Discussion

Studying the use of the PT outside of the laboratory in a real life formal learning scenario has several implications. In studies conducted in the lab, the setup of the experiment is carefully designed, allowing experimenters to have full control of variables such as time of each experimental session, location and instruments. This control allows the acquisition of reliable and replicable results. For this study we had to adapt our setup according to the restrictions of the ongoing course followed by the students. We encountered two main challenges while designing and conducting our study: time and location.

Regarding time, in previous laboratory studies participants had individual timeslots of sixty minutes, in which they received all the necessary briefing and had five practice sessions with the PT. Moreover, experimenters had the chance to conduct their study with sufficiently large control and treatment groups, allowing them to obtain significant results [13]. For this study we had two hours to conduct the whole experiment, without knowing beforehand the number of students who would show up for the course that day. Therefore, we reduced the number of training sessions from five to three, and then to only two during the course of the experiment. Training with the PT is an individual activity designed to be performed in a quiet room where the learner can focus on the task. That forced us to use a separate room where one student could do the practice session while the others waited in the lecture room. The room used for the practice sessions was not designed for the setup of the PT. The location of the power plugs, the lighting conditions, and the places available for positioning the Kinect and the laptop running the PT were far from ideal. Not having the ideal practice setup partially explains the difference between the average pTM values obtained in this study and the ones obtained in laboratory conditions [13]. In lab conditions the average values for the first and second training sessions were 0.51 and 0.32 respectively, while in this study they were 0.69 and 0.41. Nevertheless, despite these differences, the values followed a similar trend, showing similar improvements in a less than ideal setting.

Previous studies showed that using the PT to practice for presentations improves the performance of the learner according to the measurements tracked by the PT [13]. The second objective of this study was to investigate whether using the PT to practice a presentation also influences the way the audience perceives it. Results from this study showed that, according to a human audience, all participants performed better in all aspects after two practice sessions with the PT. The restricted time slot and the restricted number of participants did not allow us to use a control and a treatment group. Therefore, it is not possible to determine directly whether the improvements perceived by the audience are the result of practicing with the PT or of practicing in general. The results, however, revealed three key aspects suggesting the influence of the PT on this perceived improvement. The first key aspect is revealed by the assessed improvements regarding the general aspects of a presentation. The item showing the least improvement between the first and the second pitch is the knowledge that the presenter displayed regarding the topic, whereas the item showing the biggest improvement was the delivery of the pitch. This aligns with the fact that the focus of the practice sessions using the PT was purely on the delivery of the pitch.

The second key aspect pointing out the influence of the PT has to do with the use of gestures. Use of gestures exhibited the biggest improvement from the first human assessed pitch to the second. This aligns with the computer assessment from the two practice sessions, where the aspect exhibiting the biggest improvements was also the use of gestures.

The third key aspect suggesting the influence of the PT is the PT’s assessment of three of the nine final pitches. In previous studies the average total pTM for presentations by people who did not practice with the PT was close to 1.0, in contrast with the results of this study, where all three measured final pitches had a total pTM below 0.67. Unfortunately, as mentioned before, due to technical and logistical difficulties we were not able to assess all pitches using the PT.

For the third objective of this study we investigated whether the introduction of a tool such as the PT can contribute to the creation of more comprehensive learning scenarios for the acquisition of public speaking skills. Results from our study support this. As seen in the evaluations of the first pitch, the highest evaluated aspect was the knowledge of the topic displayed by the presenter. This hints that, when preparing for a presentation or a pitch, a common practice is to focus efforts on preparing only its content. This practice does not seem optimal given the strong correlation measured in this study between the overall quality of a pitch and the quality of its delivery. The results illustrate how, by practicing the pitch twice using the PT, students significantly improved its overall quality. The students also reported benefits from using the PT to practice. They affirmed that the practice sessions helped them to learn something about public speaking and to increase their awareness of their nonverbal communication. It is interesting to note that, according to the students, the feedback of the PT complements the feedback received from tutors and peers. Three students stated that the immediate feedback received from the PT helped them to identify and correct their behavior precisely. One more important aspect to note is that students expressed the intention to use the PT in the future.

This study showed some benefits of using a tool such as the PT to support common practices for learning public speaking skills. However, the introduction of such a tool is still a challenge. The Microsoft Kinect is not a product owned by many students, and it is not feasible to provide each student with a Kinect in order to train for a few minutes for their presentations. However, Intel is already working on the miniaturization of depth cameras that can be integrated into laptop computers (footnote 4). Therefore, in the medium term it will become more feasible for students to have access to tools such as the PT and use them for home practice. In the meantime, dedicated places for practicing the delivery of presentations would be needed in order to bring the support of these types of tools into current practices for teaching and learning public speaking skills.

7 Conclusion and Future Work

The creation of multimodal learning technologies to support the development of public speaking skills has been driven in recent years by the advances and availability in sensor technologies. In laboratory settings some of these technologies have already started to show promising results. In this study we took one of these technologies, the Presentation Trainer, outside of the lab and conducted some tests with students following an entrepreneurship course as part of the course agenda. The main purpose of this study was to start the exploration of the support that these technologies can bring to a formal learning scenario.

Studying the use of the PT for a real classroom task revealed that location and time constraints interfere with the straightforward conduct of research. Due to location constraints it was not possible to set up the PT in ideal conditions for its use. Due to time constraints it was not possible to have the students follow all the expected training sessions, and we were not able to use the PT to measure all the first and second pitches presented to the audience. These constraints do not allow us to determine the causes of some of the results obtained in this study. However, results from this study align to a large extent with results obtained in the lab [13].

Regarding the support that the use of a tool such as the PT can bring to the established practices of teaching and learning public speaking skills, results from this study show the following:

  • Students see themselves willingly using a tool such as the PT to practice for future presentations.

  • Students find the feedback of the PT to be a good complement to the feedback that peers and tutors can give.

  • Practicing with the PT leads to significant improvements in the overall quality of a presentation according to a human audience.

For future work we plan to present the results of this study, which indicate the advantages of using the PT, to coordinators of public speaking courses. This will be accompanied by a plan for dealing with the environmental constraints that impede the setup of the PT and, hence, its use in the wild. Furthermore, we plan to continue improving the PT. The purpose of the PT is to help humans give better presentations to humans. Hence, we plan to explore the relationship between human-based and machine-based assessment, and to study how this information can later be used to provide learners with better feedback.

To conclude, there is still a lot of room for improvement for multimodal learning applications designed to support the development of public speaking skills. Introducing them to formal and non-formal educational scenarios still poses some practical challenges, although the application of the PT in a practical setting may not require conditions as strict as those in our research. In any case, studying the use of the PT in the wild has shown promising results regarding the support that such tools can bring to current practices for learning public speaking skills, indicating how courses on developing public speaking skills can be enhanced in the future.