The multi-modal nature of the Patient Consultation Corpus allows its data to be analysed from a variety of different perspectives. This not only has significant value within individual research areas, but also provides opportunities to examine connections between them. Here, we briefly outline four ways in which the data in the Patient Consultation Corpus can be analysed: from the perspectives of models of structured dialogue, virtual agent design, communication intent and style, and interpersonal stance. Note that for each perspective, we do not describe a full analysis nor discuss multiple alternative approaches because our intention is only to show that the Patient Consultation Corpus can be analysed in these ways; we leave full analyses to future work.
Models of structured dialogue
Analysing the dialogical structure of multi-party interactions can help us understand how those interactions unfold and which strategies participants adopt in order to reach different outcomes. Even exchanges that seem relatively trivial can contain linguistic and strategic nuances that only become apparent under close analysis. By analysing the Patient Consultation Corpus in this way, we can therefore obtain insights into the ways in which individual practitioners handled patients with different personality types.
Inference Anchoring Theory (IAT) is an analytical framework which enables the structure of dialogues to be represented by extracting the illocutionary force of the locutions (Budzynska and Reed 2011). The structure in IAT is described as “the shape of the discussion” and it aims to represent how the participants’ dialogical moves combine to form an argument. Encompassing Speech Act Theory (Searle 1969), IAT also allows the relationship between speech acts to be represented. Using IAT to analyse the Patient Consultation Corpus reveals the dialogical structure of the individual consultations, thus providing an understanding of the ways in which they can unfold and the strategies the health care practitioners adopt. Furthermore, IAT analyses can feed into the design and development of reusable models of dialogue using processes such as those proposed by Snaith and Reed (2016). Such models can subsequently be used to underpin dialogue-based health care support systems.
An example IAT analysis, created using the Online Visualisation of Argument (OVA+) tool (Janier et al. 2012), is shown in Fig. 6. This example shows the analysis of a small (254-word) excerpt from the Patient Consultation Corpus, chosen to illustrate the core IAT concepts. The magnified section shows the connection between the dialogical process on the right and the resultant argument on the left. In a dialogue, individual utterances are connected by dialogical transitions, while transitions and utterances are connected to the argument structure by illocutionary forces (e.g. “Asserting”, “Disagreeing”). In an argument, individual statements can support, attack or rephrase each other; these relations are represented by rule applications (e.g. “Default Inference”), conflict applications (e.g. “Default Conflict”), and rephrase applications (e.g. “Default Rephrase”) respectively.
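The node and edge types described above can be sketched as a small typed graph. The following is only an illustrative sketch, not the OVA+ data format or an official IAT schema: the class `IATGraph`, the node-kind constants, the label "Transition" and the two utterances are all hypothetical and do not come from the corpus.

```python
from dataclasses import dataclass, field

# Node kinds are illustrative labels, not an official IAT or OVA+ schema.
LOCUTION = "locution"        # an utterance in the dialogue
TRANSITION = "transition"    # a dialogical transition between utterances
ILLOCUTION = "illocution"    # an illocutionary force, e.g. "Asserting"
PROPOSITION = "proposition"  # a statement in the argument
RELATION = "relation"        # a rule, conflict or rephrase application


@dataclass
class IATGraph:
    nodes: dict = field(default_factory=dict)  # node id -> (kind, text)
    edges: list = field(default_factory=list)  # (source id, target id)

    def add_node(self, node_id, kind, text):
        self.nodes[node_id] = (kind, text)

    def add_edge(self, source, target):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before linking them")
        self.edges.append((source, target))

    def anchored_by(self, node_id):
        """Return the illocutionary forces that point at node_id."""
        return [
            self.nodes[src][1]
            for src, dst in self.edges
            if dst == node_id and self.nodes[src][0] == ILLOCUTION
        ]


def build_example():
    """Build a two-utterance disagreement (invented text, not corpus data)."""
    g = IATGraph()
    g.add_node("L1", LOCUTION, "Coach: Regular walks will help.")
    g.add_node("L2", LOCUTION, "Patient: Walking makes no difference.")
    g.add_node("T1", TRANSITION, "Transition")
    g.add_node("P1", PROPOSITION, "Regular walks help.")
    g.add_node("P2", PROPOSITION, "Walking makes no difference.")
    g.add_node("I1", ILLOCUTION, "Asserting")
    g.add_node("I2", ILLOCUTION, "Asserting")
    g.add_node("I3", ILLOCUTION, "Disagreeing")
    g.add_node("C1", RELATION, "Default Conflict")

    # Dialogue side: utterances are connected by a transition.
    g.add_edge("L1", "T1")
    g.add_edge("T1", "L2")
    # Anchoring: illocutionary forces link the dialogue to the argument.
    g.add_edge("L1", "I1")
    g.add_edge("I1", "P1")
    g.add_edge("L2", "I2")
    g.add_edge("I2", "P2")
    g.add_edge("T1", "I3")
    g.add_edge("I3", "C1")
    # Argument side: P2 attacks P1 through a conflict application.
    g.add_edge("P2", "C1")
    g.add_edge("C1", "P1")
    return g
```

In this sketch, `build_example().anchored_by("C1")` returns the forces anchoring the conflict, which mirrors how the disagreement is tied to the transition between the two utterances rather than to either utterance alone.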
Virtual agent design
Several applications currently being developed in the Intelligent Virtual Agents research domain use virtual agents as a coach or an assistant rather than merely as a tool for providing information. Researchers are working towards making these agents as human-like as possible by advancing their communicative abilities and social behaviours. Non-verbal behavioural cues such as gaze, facial expressions, gestures and body postures indicate the attitude of a given individual in any social situation (Richmond et al. 1991) and convey information about affect, mental state, personality, and other traits (Vinciarelli et al. 2009). Studies of human–human interaction can be used to understand the role of verbal and non-verbal behaviours in conversations and to incorporate those behaviours into virtual agents.
The MUMIN multimodal scheme allows for the annotation of multimodal communicative behaviours from the perspective of three communicative functions: feedback, turn management and sequencing (Allwood et al. 2007). Feedback provides information about the interaction through signals such as facial expressions; turn management regulates the flow of the interaction through, for example, turn gain and turn hold; and sequencing deals with the organisation of a dialogue into meaningful sequences.
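As a rough illustration of how annotations under these three functions might be stored and summarised, the sketch below defines a simple time-stamped record. It is not the MUMIN coding manual or any annotation tool's file format: the class `BehaviourAnnotation`, its field names, and the example values and timings are assumptions made for illustration only.

```python
from collections import Counter
from dataclasses import dataclass

# The three communicative functions named in the text.
FUNCTIONS = {"feedback", "turn management", "sequencing"}


@dataclass(frozen=True)
class BehaviourAnnotation:
    participant: str  # hypothetical speaker label, e.g. "coach_1"
    start_s: float    # start time in seconds
    end_s: float      # end time in seconds
    modality: str     # e.g. "face", "gaze", "hand"
    function: str     # one of FUNCTIONS
    value: str        # e.g. "turn gain", "turn hold"

    def __post_init__(self):
        if self.function not in FUNCTIONS:
            raise ValueError(f"unknown communicative function: {self.function}")
        if self.end_s <= self.start_s:
            raise ValueError("an annotation must end after it starts")


def count_by_function(annotations):
    """Count how many annotations fall under each communicative function."""
    return Counter(a.function for a in annotations)


def example_annotations():
    """Invented annotations for illustration; not taken from the corpus."""
    return [
        BehaviourAnnotation("patient", 1.0, 1.8, "face", "feedback", "smile"),
        BehaviourAnnotation("coach_1", 2.0, 2.4, "gaze", "turn management", "turn gain"),
        BehaviourAnnotation("coach_1", 2.4, 6.0, "hand", "turn management", "turn hold"),
        BehaviourAnnotation("patient", 6.1, 6.5, "head", "feedback", "nod"),
    ]
```

Calling `count_by_function(example_annotations())` gives a per-function tally, which is one simple way to compare how often participants give feedback versus manage turns.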
To facilitate such annotations, the video recording setup in the Patient Consultation Corpus was designed to capture behavioural cues on two levels. The first is the individual level, where we aim to capture the non-verbal cues of a single individual, such as gaze behaviour, facial expressions, head movements, hand gestures and body movement. The second is the group level, where we aim to capture turn-taking behaviour (how and when individuals take turns to speak or help others to speak), interpersonal attitude, and postural congruence. These behaviours help us understand the relationships, interpersonal attitudes and roles of the individuals in the group, and can help in modelling virtual agents to fit a specific role; for example, we can study the non-verbal behaviours of a human doctor and model a diabetes coach that emulates them.
Coaching communication intent and style, and interpersonal stance of coaches
When a medical practitioner communicates something to a patient, it is important to consider not only what they communicate, but also how they communicate it and how it comes across. Furthermore, practitioners need to be able to adjust to changes in the patient's stance.
The audio–visual setup in the Patient Consultation Corpus allows us to make use of annotation schemes that examine: the intent behind communication (e.g. Verbal Response Modes, VRM; Stiles 1992); the form of communication (e.g. Interaction Process Analysis, IPA; Bales 1951); and the interpersonal stance of participants (e.g. the Interpersonal Circumplex, IPC; Leary 1957).
The VRM annotation is concerned with what people do by saying something, rather than with the content of what they say. It describes the relation of the speaker to the other in a discourse, and was designed as a general-purpose tool for classifying speech acts. The IPA annotation focuses on describing the kind of behaviour and the message it conveys. It originates from the annotation of conversations held during group work. Broadly speaking, it concerns the type of communication being used and classifies it as either task-related or social–emotional. The IPC annotation focuses on the type of personality people convey through the stance they take during discourse, in terms of the dominance versus submissiveness and the hostility versus friendliness they show. It originates from observations made in psychotherapeutic settings.
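The two IPC dimensions described above can be illustrated with a minimal sketch that maps a pair of scores to a coarse stance label. The numeric scale from -1 to 1, the function name `ipc_quadrant` and the four quadrant labels are assumptions made for illustration; the annotation scheme itself may use finer-grained categories.

```python
def ipc_quadrant(dominance, friendliness):
    """Map two scores in [-1, 1] to a coarse interpersonal stance label.

    dominance:    -1 (submissive) to 1 (dominant)
    friendliness: -1 (hostile) to 1 (friendly)
    The scale and the labels are illustrative assumptions.
    """
    for name, score in (("dominance", dominance), ("friendliness", friendliness)):
        if not -1.0 <= score <= 1.0:
            raise ValueError(f"{name} must lie in [-1, 1], got {score}")
    vertical = "dominant" if dominance >= 0 else "submissive"
    horizontal = "friendly" if friendliness >= 0 else "hostile"
    return f"{vertical}-{horizontal}"
```

For example, `ipc_quadrant(0.5, -0.2)` yields "dominant-hostile", while `ipc_quadrant(-0.3, 0.7)` yields "submissive-friendly". Scores of exactly zero are assigned to the dominant and friendly sides here, which is an arbitrary tie-breaking choice.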
Figure 7 shows part of the VRM, IPA, and IPC (here LR) annotation for the same excerpt analysed using Inference Anchoring Theory (Sect. 5.1). It shows the annotation of each coach's behaviour under each annotation scheme. For some schemes, we created separate tracks for the different categories of behaviour within the models on which they are based. We plan to annotate more excerpts in the near future to gain more insight into the interactions between coaches and their patient.