An Active Analysis and Crowd Sourced Approach to Social Training
Interactive narrative (IN) has increasingly been used for social skill training. However, providing learners with the flexibility to replay scenarios with sufficient variety to achieve proficiency requires extensive content creation. The goal of our work is to address these issues by developing a generative narrative approach that re-conceptualizes social training IN as improvisation using Stanislavsky’s Active Analysis (AA), and that uses AA to structure a crowd sourced content creation method. AA is a director-guided rehearsal technique that promotes the Theory of Mind skills critical to social interaction and decomposes a script into key events. In this paper, we discuss AA and the iterative crowd sourcing approach we developed to generate rich, coherent content that can be used to build a generative model for interactive narrative.
Keywords: Intelligent narrative technologies · Active analysis · Theory of mind · Crowd sourcing · Social skills training
1 Introduction
Effective social interaction is a critical human skill. To train social skills, there has been rapid growth in narrative-based simulations that allow learners to role-play social interactions with virtual characters and, in the process, ideally learn the social skills needed to handle socially complex situations in real life. Examples include training systems addressing doctor-patient interaction, cross-cultural interaction within the military, and childhood bullying.
Although interactive narrative systems represent a major advance in training, their designs often do not give learners the flexibility to replay scenarios with sufficient variety of choices to support learning and the achievement of proficiency. Attempts to create more generative experiences face several challenges. First, the combinatorial explosion of alternative narrative paths makes creating content for all the paths overwhelming, especially for long narrative arcs. Second, current narrative systems are often brittle and constrained in their generativity and flexibility to adapt to the interaction.
Additionally, training systems are often designed around exercising specific skills in specific situations. However, it is also important for social skill training to teach skills more broadly. Fundamental to effective human social interaction is the ability to hold and use beliefs about the mental processes and states of others, commonly called Theory of Mind (ToM). ToM skills are predictive of social cooperation and collective intelligence, and are implicated in a range of other social interaction constructs, e.g., cognitive empathy and shared mental models. Although children develop ToM at an early age, adults often fail to employ it. On the other hand, people who engage in ToM across many situations, including actors, show improved ToM skills.
Our goal is to develop generative social training narrative systems that support replay as well as embed ToM training. Achieving this requires: (1) a new generative model for conceptualizing narrative/role-play experiences, (2) new methods to facilitate extensive content creation for those experiences and (3) an approach that embeds ToM training in the experience to support better learning outcomes.
Our approach begins with a paradigm shift that re-conceptualizes social skill simulation as rehearsing and improvising roles instead of performing a role. We have adapted Stanislavsky’s Active Analysis (AA) rehearsal technique as the design basis for social simulation training. AA was developed to help theater actors rehearse a script or text. The overall script is divided into key events (i.e., short scenes) that actors rehearse and improvise under a director’s guidance. AA has two attributes especially relevant to social skills training. First, AA is designed to foster an actor’s conceptualization of their own beliefs, motivations and behavior as well as those of other actors, and thus engenders ToM reasoning. Second, adapting AA to simulation-based social skills training shifts the emphasis to developing short scenes that allow variability and re-playability. Decomposition into short rehearsal scenes helps: (a) break the combinatorial explosion that exacerbates content creation for long narrative arcs, (b) support users replaying scenes, possibly in different roles with different virtual actors, and (c) let users directly experience in subsequent scenes the larger social consequences of their behaviors.
AA provides a basis for experience design that uses ToM constructs and enables crowd sourcing to generate content for rich, coherent interactive experiences. Several researchers have proposed crowd sourcing techniques for narrative creation (e.g., [12, 13]). The work presented here differs in its focus on an iterative crowd sourcing method designed to craft a space of rich social interactions in which players explore a wide range of social gambits, from ethical persuasion and personal appeals to even deception; the content is created by the crowd using carefully designed tasks and interfaces that take AA and ToM as their theoretical foundation.
In this paper, we present our approach in detail. Specifically, we will start by discussing the theoretical foundation for our work: Active Analysis. We then outline our overall approach to interactive narrative system design. Following this, we discuss the 3-step iterative crowd sourcing approach outlining the method, as well as the results for each step of the process. We then discuss overall results of the current work and projections for future work. Following this discussion, we outline previous related work and then conclude the paper.
2 Active Analysis Primer
AA serves several purposes in our project. It defines the overall structure of the learning experience as rehearsals of key scenes. Also, it helps define ways to manipulate the challenges the learner faces in those scenes as well as the feedback the learner gets. Most critically, AA provides a novel and potentially very powerful tool for ToM and more generally social skills training. Through AA, the human role-player/learner is called upon to conceive their behavior in terms of how other participants as well as observers understand and react to it. Further, they may have flawed mental models of others. Additionally, re-conceptualizing interactive narrative as performance rehearsal brings in elements of game play, such as repeat play as a form of rehearsal/mastery.
Stanislavsky developed AA in the latter part of his career to transform actor training and performance rehearsal techniques by emphasizing the importance of cognitive-social-affective mental states. As prelude to actors engaging in active analysis, the script is broken into small key scenes that the actors improvise and analyze. For our purposes in automating AA, this breakdown of a script would be part of the design, and therefore a larger training scenario is assumed to be broken into these brief key events/scenes. Consider the following simple multi-scene scenario that is currently the basis of our work:
Scene 1: A novice reporter (R) witnesses a passenger attacking other passengers and a black good Samaritan tackling the rampaging passenger to protect the others. A policeman enters and arrests the good Samaritan.
Scene 2: R goes to interview police chief (PC) at his house but a guard (G) at the house may have orders to block reporters.
Scene 3: R meets PC to persuade him to let the good Samaritan go.
AA of a scene consists of three phases: Framing, Improvisation and Performance Analysis. These phases are repeated, with the rehearsal director often changing the actors’ motivations and tactics as well as the roles they play.
Framing: The actors and director determine what the overall context of the scene is and what event (or goals) should occur. This guides the actors as they improvise. Some actors will play ‘impelling actions’ which move purposefully toward the event. Others play ‘counter-actions’ which resist, delay or block the targeted event/goal. Using Scene 2 from the above scenario, impelling actions for R might be to explain or justify her purpose to meet PC, to elicit empathy from G or even threaten or attack G. The rehearsal director helps the actors explore different motivations and approaches to their role. In particular, he/she can selectively provide information about goals and actions to an actor without the other actors knowing that information. Thus, an actor may have a flawed mental model of the other actors leading into the improvisation. This mental model manipulation provides a powerful tool in social skills training.
Improvisation: During improvisations, actors explore different tactics to achieve their goals. Sometimes this improvisation fails without the goal being achieved and the actors not knowing how to proceed. This also can be a key point for the design of social training simulations, as sometimes the interaction may just simply break down and a different approach must be taken.
Performance Analysis: After completing the improvisation, the director and actors evaluate the work at three levels of analysis: (1) the individual, the actor thinking of their own actions, (2) the social, how those actions are countered by other actors, and (3) the audience, how the actions expressed information to others observing the performance. The actors then revise their preliminary analysis and repeat the improvisation. This process of framing, improvisation and analysis is repeated until the group has achieved what they wish.
3 Overview of the AA Social Skill Training System
Figure 1 presents the current overall design of our system. In line with AA rehearsals, we assume the overall scenario is broken into highly variable interaction scenes that constitute key events. These scenes are performed by the human and virtual actors under the director agent’s guidance. To represent this structure, we use a high-level state graph adapted from StoryNet. The nodes in the graph represent AA’s key events/scenes: free-play areas in which the human participant can rehearse their role with virtual characters, exploring different approaches to the role within a scene. The edges in the graph connect the scenes; traversal along an edge constitutes completion of the rehearsal of one scene and a move to a subsequent scene. This traversal depends on whether the rehearsal has achieved its goal or some minimal number of improv sessions has occurred. Framing sets up the learner’s role in the scene. Performance Analysis, as in AA, provides feedback on the characters’ beliefs and motivations as well as cues on how to achieve scene goals. This feedback leverages knowledge acquired during the crowd sourcing process.
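The graph structure and traversal rule described above can be sketched as follows. This is an illustrative sketch, not the actual system code: the names (Scene, can_advance) and the session threshold MIN_SESSIONS are assumptions, since the paper only says "some minimal number of improv sessions".

```python
from dataclasses import dataclass, field

@dataclass
class Scene:
    """A node in the state graph: one AA key event/rehearsal scene."""
    name: str
    goal_achieved: bool = False
    improv_sessions: int = 0
    next_scenes: list = field(default_factory=list)  # edges in the graph

MIN_SESSIONS = 3  # assumed threshold for moving on without achieving the goal

def can_advance(scene):
    """Edge traversal: rehearsal goal met, or enough improv sessions run."""
    return scene.goal_achieved or scene.improv_sessions >= MIN_SESSIONS

# The three-scene reporter scenario from Sect. 2
s3 = Scene("Scene 3: persuade PC")
s2 = Scene("Scene 2: get past the guard G", next_scenes=[s3])
s1 = Scene("Scene 1: witness the arrest", next_scenes=[s2])

s1.improv_sessions = 3
print(can_advance(s1), can_advance(s2))  # True False
```

The threshold keeps a learner from being stuck indefinitely in one scene while still encouraging repeated rehearsal, mirroring the repeated framing/improvisation/analysis cycle of AA.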
4 Within Scene Content
4.1 Creative Step: Crowd Sourcing Scenes
The task for scene creation is to elicit very different variants of how a particular scene plays out, from which a space of possible actions can be isolated and recombined to create a rich interactive experience. We designed and deployed multiple interfaces. Due to space constraints, we only discuss the most promising design, the AA Interface, which was based on AA techniques for eliciting variations from actors improvising a scene. The AA interface seeks to constrain the worker to crafting relatively simple sentences, where each sentence describes one action that a character performs (in line with ). The same scenario description was used for both tasks: R, an aspiring reporter, wants to interview PC at his house. PC employs G to make sure that nobody is able to bother him. G guards the door. R is in front of PC’s house and is planning to attempt to interview him.
Story Collection Results: We collected 108 stories using the AA interface. Most stories consisted of adjacency pairs (act/react pairs), in which R would take an action and G would then respond. The average story length was 6 lines.
The stories are very rich in terms of character actions and the intention behind each action. This applies to R in particular, while G’s behaviors varied mainly in his responses to R’s actions. The intentions behind the actions, however, remain relatively static through most of each story until the conclusion, where G either abandons his goal of blocking R or successfully blocks her. The collected stories show an impressive range of complexity, richness and variety in the tactics R employs, resulting in a sizable action set for our characters. To give a better sense of the variety of actions collected, we provide two examples, a relatively simple bribe story and a more complex manipulation story:
Bribe: R asks G to see PC. G declines her request. R tries to bribe G. G declines R’s bribe. R adds $50.00 more to the bribe. G accepts the bribe.
Manipulation: R flirts with G. G tries to ignore R. R compliments G a lot. G begins to flirt with R. R tells G that she just needs a few teeny minutes with PC. G becomes wary and tells her to leave.
4.2 Analytical Step: Crowd ToM Annotation
As noted, ToM annotations are critical both for the pedagogy provided during Performance Analysis and for constraining how actions are intermixed to create variation in the learners’ interactive narrative experience. To get this information, we conducted a second crowd task using the annotation interface seen in Fig. 4.
Initial goals and beliefs given to the workers:

The Reporter (R)
- Goal: meet with the police chief (PC)
- Beliefs: PC is in the house; G is likely to block her

The Guard (G)
- Goal: block R from entering the house
- Beliefs: PC is in the house; PC doesn’t want to meet any reporters; R is likely a reporter
Annotation Result and Discussion: We randomly selected 3 stories to annotate from the story corpus obtained in the first crowd stage, described in the previous section. Data from 120 workers were collected; each story was annotated by 40 workers. For each action, the 2 most agreed-upon combinations were selected as that action’s potential effects.
Examples of the annotation result:

Action: R dresses up like a postman. Intentions (wants to change): G’s belief that she is likely a reporter; G’s goal to block her from entering the house.

Action: G accepts the bribe. Intentions (wants to change): G’s own goal to block R from entering the house; G’s own goal to please PC.

Action: R tells G the interview has already been scheduled. Intentions (wants to change): G’s belief that PC does not want to see reporters; G’s goal to block R.
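The aggregation step, keeping the two most agreed-upon intention combinations per action, can be sketched as a simple vote count. The vote labels below are invented for illustration, not taken from the study data.

```python
from collections import Counter

# Hypothetical worker votes for the intended effects of one action
votes = ["change G goal", "change G goal", "please PC",
         "change G belief", "change G goal", "please PC"]

# Keep the 2 most agreed-upon combinations as the action's potential effects
top2 = [combo for combo, _ in Counter(votes).most_common(2)]
print(top2)  # ['change G goal', 'please PC']
```

With 40 annotators per story, this kind of majority aggregation filters idiosyncratic readings of an action while preserving genuinely ambiguous intentions (an action can retain two distinct effects).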
4.3 Assessment Step: ToM Causal Model
To develop an interactive system with a rich and generative action space, we need to reorganize and recombine the collected actions while maintaining the coherence of the story. As discussed above, we decided to use the crowd to assess the stories developed through the crowd sourcing process. However, the combinatorial explosion caused by random recombination makes it a daunting task for workers to assess the results. For instance, the 3 stories we annotated contain just 20 sentences, with 10 actions for each character. If we randomly combine 6 sentences to generate a new story, the number of possible combinations is \(10*10*9*9*8*8 = 518400\). Most of these narratives are nonsensical or incoherent. Thus, to eliminate them, the crowd-provided ToM annotation data was used to create a causal model, the ToM Causal Model (TCM), that establishes a precedence relation between sentences. The basic assumption of the TCM is that the intention of an action selected by a character should be consistent with the intentions of previous actions in the story; this can be used to determine which actions may follow a particular prefix of actions in a generated story.
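The combinatorial estimate above can be reproduced directly: a 6-sentence story alternates the two characters' turns, each character contributing 3 of their 10 actions drawn without replacement.

```python
from math import prod

# R and G alternate turns; each contributes 3 of their 10 actions,
# drawn without replacement: 10*10*9*9*8*8 possible orderings.
count = prod([10, 10, 9, 9, 8, 8])
print(count)  # 518400
```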
The TCM is based on the idea of an intention being in play. An intention is in play at some point in a seed story, if in that story a preceding action has that intended effect. For example, from the annotation data we know the intended effects of ‘bribe’ are ‘changing G’s goal of blocking her’ (I1) and ‘increasing G’s liking of her’ (I2). Thus, after ‘bribe’ happens, I1 and I2 are defined as ‘in-play’.
From this information, we can construct an abstract model of actions that describes their in-play intentions as well as intended effects, as a heuristic model of preconditions and effects. Specifically, an action is defined as a tuple \(A = \langle PI, EI, \varPhi \rangle\) where PI is the set of ‘in-play’ intentions that existed when action A happened in the seed story and EI is the set of effects intended by taking this action. \(\varPhi \) defines the phase where the action occurs: beginning, middle or end.
When generating new stories, an action can only be selected if all the intentions in its PI are in play. For example, in the ‘bribe’ story, R bribes G; after the bribe is declined, she bribes $50 more. Thus, ‘bribe-more’ can only happen when intentions I1 and I2 are in play, so in our TCM these intentions are set as the PI for the action ‘bribe-more’. After an action is selected, its intended effects EI will be added to the set of ‘in-play’ intentions. Additionally, we constrain action selection based on the phase \(\varPhi \). We assume the initial/ending actions in the seed stories, such as ‘greet’, ‘introduce herself’, ‘let R in’ or ‘arrest R’, are more likely to happen at the beginning/end of the story. All intermediate actions can happen at any place in between.
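A minimal sketch of the TCM selection rule follows. The action names and intention labels (I1, I2 follow the bribe example above) are illustrative; in the real system the tuples are induced from the annotated seed stories.

```python
# Toy TCM: each action is a tuple (PI, EI, phase). PI must be a subset of
# the in-play intentions for the action to be selectable; EI is added to
# the in-play set afterwards.
BEGIN, MIDDLE, END = "begin", "middle", "end"

actions = {
    "greet":      (set(),        {"G notices R"}, BEGIN),
    "bribe":      (set(),        {"I1", "I2"},    MIDDLE),
    "bribe-more": ({"I1", "I2"}, {"I1"},          MIDDLE),
    "let R in":   ({"I1"},       set(),           END),
}

def admissible(name, in_play, position, length):
    """Can this action occur at this story position, given in-play intentions?"""
    pi, _, phase = actions[name]
    if phase == BEGIN and position != 0:
        return False
    if phase == END and position != length - 1:
        return False
    return pi <= in_play  # every PI intention must already be in play

def apply_action(name, in_play):
    """After selection, the action's intended effects EI become in-play."""
    return in_play | actions[name][1]

in_play = set()
print(admissible("bribe-more", in_play, 1, 4))  # False: I1, I2 not yet in play
in_play = apply_action("bribe", in_play)
print(admissible("bribe-more", in_play, 2, 4))  # True
```

Generation then amounts to repeatedly sampling an admissible action and updating the in-play set, which prunes prefixes whose intentions are inconsistent with what came before.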
The TCM Result and Discussion: Obviously, the TCM is not a perfect coherency filter; in particular, it over-prunes. Still, pruning makes it feasible to further test coherence using the crowd. We therefore asked another group of workers to evaluate the results using 5-point Likert items adapted from  to assess the stories’ coherence and consistency: I understand the story; The story makes sense; Every sentence in the story fits into the overall plot; The characters’ behaviors are consistent with their goals and beliefs; The characters’ interaction is believable.
We collected evaluations from 40 workers for each story. We mapped the 5-point Likert items to a score in the range \([-2, 2]\). We eliminated all stories with at least one non-positive score, leaving us with 12 stories. To validate the workers’ assessment, we asked an expert who is familiar with ToM and interactive narrative to evaluate all generated stories. The crowd result was generally more selective than the expert’s. We found one story with an illicit action pair (declining a bribe before the bribe is offered) that was excluded by the expert but not by the crowd. This suggests we may need to consider alternative criteria. For example, eliminating stories whose total score across the 5 questions is less than a threshold \(T_e = 2.5\) leads to better agreement with the expert.
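The two elimination criteria discussed above can be stated concretely. The example scores below are invented, and we assume each story is summarized by five per-item scores (worker means mapped to the \([-2, 2]\) range).

```python
T_E = 2.5  # threshold from the alternative criterion

def strict_keep(scores):
    """Criterion used in the study: every item score must be positive."""
    return all(s > 0 for s in scores)

def threshold_keep(scores):
    """Alternative with better expert agreement: total must reach T_E."""
    return sum(scores) >= T_E

story = [1.2, 1.8, 0.0, 0.9, 1.1]  # invented mean item scores
print(strict_keep(story), threshold_keep(story))  # False True
```

A story with one weak item but a strong overall profile is eliminated by the strict rule yet kept by the threshold rule, which is the direction in which the expert disagreed with the crowd.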
5 Discussion and Future Directions
Our results show the promise of adapting AA and crowd sourcing to generate rich and varied stories from a simple initial scene. The results indicate the utility of designing crowd tasks using AA principles to inspire creative output. In particular, we found that directorial guidance from the AA interface can directly influence the stories. For example, the AA interface successfully primed workers to focus on manipulative and deceptive behavior. This suggests that going forward we will be able to steer workers toward the needed pedagogical content.
We also found, in the ToM annotation task, that the crowd was able to annotate these stories with sophisticated ToM concepts and with high agreement. We showed how these annotations can provide critical pedagogical content, and we also demonstrated how they can be incorporated into the TCM to help assess generated stories. This led to a significant reduction in the number of candidate stories and enabled us to successfully go to the crowd for a final validation check.
While the ToM annotation scheme gave us a significant amount of ToM information, it is still simple. Adding more layers in subsequent schemes might capture more detailed information, such as tactics. The high agreement suggests the crowd might be capable of performing more detailed annotation. Finer-grained annotation schemes would help categorize content to a greater degree, benefiting both the design of the pedagogy and the performance of the TCM.
In addition, we can improve the TCM in other ways to ensure its scalability. With the assessment results, we can construct a stronger TCM by abstracting the conditional relation between two actions based on whether they (1) have high mutual information in good stories and (2) are found in reverse order only in eliminated stories. With these two criteria, we can derive conditional constraints, e.g., that ‘bribe more’ and ‘decline bribe’ can only happen if ‘bribe’ has already occurred. These causal relations, combined with the TCM, can be used to filter out incoherent stories. As an alternative to the TCM, we will also explore machine learning techniques to learn a probabilistic model that can evaluate the generated stories.
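The ordering part of this derivation (criterion 2) can be checked mechanically; the mutual-information criterion would be layered on top. The toy corpora below are invented for illustration.

```python
# Derive an ordering constraint "b only after a": b never precedes a in
# kept stories, while the reverse order does appear among eliminated ones.
kept = [["bribe", "decline bribe", "bribe more"],
        ["greet", "bribe", "bribe more"]]
eliminated = [["decline bribe", "bribe", "let R in"]]

def b_precedes_a(a, b, story):
    return a in story and b in story and story.index(b) < story.index(a)

def derive_constraint(a, b):
    never_reversed = not any(b_precedes_a(a, b, s) for s in kept)
    reversed_in_bad = any(b_precedes_a(a, b, s) for s in eliminated)
    return never_reversed and reversed_in_bad

print(derive_constraint("bribe", "decline bribe"))  # True
print(derive_constraint("bribe", "bribe more"))     # False: never reversed anywhere
```

Requiring the reversed order to actually occur in an eliminated story keeps the rule evidence-based: an ordering that was simply never tried in either direction does not become a constraint.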
To date, we have implemented a prototype, using the crowd results reported here, that allows a player to play the role of R or G. This prototype does not, however, include the AA director agent depicted in Fig. 1. To that end, our next step is to collect data from the rehearsal practice of a leading expert on AA, which will be used to inform the director agent design.
6 Previous Work
Successful examples of systems using crowdsourced narratives to generate new stories include Say Anything, the Scheherazade system and its interactive counterpart. Crowdsourcing has also been used to model behavior, as in the Restaurant Game. A key challenge in our work lies in populating the action space with complex social actions, especially since a sufficient number can mean thousands of actions. For example, the social game Prom Week utilized over 5000 human-authored social rules, a complex and time-consuming task, and  uses a set of known actions. Our approach extracts actions from crowdsourced narratives and, additionally, uses those actions to generate new stories to explore the possible causal relationships between them.
Annotating potentially large narrative corpora, like those generated in our work, is also a non-trivial task. The Scheherazade annotation tool utilizes Story Intention Graphs (SIG), which capture ToM concepts. However, SIG annotation requires expertise. Annotation frameworks often have a detailed structure requiring expertise to achieve adequate levels of inter-rater reliability, giving rise to much simpler annotation schemes, such as the two compared in . Neither of those two, however, focuses explicitly on annotating detailed social interactions or ToM reasoning. Any annotation scheme we use needs to reflect our ToM focus, yet be simple enough not to require training or expertise of the worker.
 also uses a multi-step iterative crowd sourcing approach to content creation, similar to the work presented here. However, their work focuses on activity oriented non-interactive narratives and actions, whereas we focus on interactive narratives involving social interaction and ToM reasoning.
In terms of focus on ToM, our work is similar to the interactive drama system Thespian . However, Thespian employs ToM reasoning to generate behavior by modeling the player and characters as autonomous agents, while we use ToM to describe and recompose character behavior without an explicit agent model.
In this paper, we introduced an iterative crowd sourcing method for interactive narrative based on AA and ToM. Designed to realize a space of rich social interactions, it allows players to explore a wide range of social gambits, from ethical persuasion and personal appeals to deception. The results of using AA and ToM as theoretical foundations show the promise of such a framework for collecting, annotating and generating interactive narratives for social skills training. Going forward, we plan to use a data-driven technique to create a director agent that will incorporate more aspects of AA into the experience.
Funding for this research was provided by the National Science Foundation Cyber-Human Systems under Grant No. 1526275.
References

- 1. Lok, B., Ferdig, R.E., Raij, A., Johnsen, K., Dickerson, R., Coutts, J., Stevens, A., Lind, D.S.: Applying virtual reality in medical communication education: current findings and potential teaching and learning benefits of immersive virtual patients. Virtual Reality 10(3–4), 185–195 (2006)
- 2. Kim, J.M., Hill, J., Durlach, P.J., Lane, H.C., Forbell, E., Core, M., Marsella, S., Pynadath, D., Hart, J.: BiLAT: a game-based environment for practicing negotiation in a cultural context. IJAIED 19(3), 289–308 (2009)
- 3. Zoll, C., Enz, S., Schaub, H., Aylett, R., Paiva, A.: Fighting bullying with the help of autonomous agents in a virtual school environment. In: 7th International Conference on Cognitive Modelling (2006)
- 4. Whiten, A.: Natural Theories of Mind: Evolution, Development and Simulation of Everyday Mindreading. Basil Blackwell, Oxford (1991)
- 8. Converse, S.: Shared mental models in expert team decision making. In: Individual and Group Decision Making: Current (1993)
- 11. Carnicke, S.M.: Stanislavsky in Focus: An Acting Master for the Twenty-First Century. Taylor & Francis, New York (2009)
- 13. Orkin, J., Roy, D.K.: Understanding speech in interactive narratives with crowd sourced data. In: Proceedings of the 8th AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment. The AAAI Press (2012)
- 14. Swartout, W., et al.: Toward the holodeck: integrating graphics, sound, character and story. In: Proceedings of the 5th International Conference on Autonomous Agents, pp. 409–416. ACM (2001)
- 15. Li, B., Appling, D.S., Lee-Urban, S., Riedl, M.O.: Learning sociocultural knowledge via crowdsourced examples. In: Proceedings of the 4th AAAI Workshop on Human Computation (2012)
- 16. Guzdial, M., Harrison, B., Li, B., Riedl, M.O.: Crowdsourcing open interactive narrative. In: The 10th International Conference on the Foundations of Digital Games (2015)
- 18. Li, B., Lee-Urban, S., Johnston, G., Riedl, M.: Story generation with crowdsourced plot graphs. In: Proceedings of the 27th AAAI Conference on Artificial Intelligence (2013)
- 19. Orkin, J., Roy, D.: The restaurant game: learning social behavior and language from thousands of players online. J. Game Dev. 3(1), 39–60 (2007)
- 20. McCoy, J., Treanor, M., Samuel, B., Reed, A.A., Mateas, M., Wardrip-Fruin, N.: Prom week: designing past the game/story dilemma. In: FDG, pp. 94–101 (2013)
- 21. Elson, D.: Modeling narrative discourse. Ph.D. thesis, Columbia University (2012)
- 22. Rahimtoroghi, E., Corcoran, T., Swanson, R., Walker, M.A., Sagae, K., Gordon, A.: Minimal narrative annotation schemes and their applications. In: 7th Intelligent Narrative Technologies Workshop (2014)
- 23. Sina, S., Rosenfeld, A., Kraus, S.: Generating content for scenario-based serious-games using crowdsourcing. In: AAAI, pp. 522–529 (2014)
- 24. Si, M., Marsella, S.C., Pynadath, D.V.: Thespian: an architecture for interactive pedagogical drama. In: Proceedings of the 2005 Conference on AIED, pp. 595–602 (2005)