While most work discussed in this section comes from the game live-streaming domain, we do not restrict ourselves to this context, so as to present a more holistic view. First, we review research that highlights the importance of interactivity and the forms it takes. Afterward, we present work that highlights individual differences and the motivations for why streamers stream and viewers consume this content. Finally, we discuss other approaches to classifying users, not restricted to the domain of live-streams, as well as the advantages of doing so.
The role of interactivity
Offering interactivity was shown to be important across live-streaming contexts. Tang et al. (2016) investigated mobile streaming apps and found that many of the activities taking place there were interactive in nature, involving both streamer and viewers. Hu et al. (2017) also highlighted how important such interactivity is, as audience participation options help to enhance the feeling of belonging to the group. Haimson and Tang (2017) focused on live-streamed events broadcast on Facebook, Periscope and Snapchat and also found that interaction is one of the key components that make remote viewing engaging. Even here, streamers are inclined not only to respond to chat comments but also to allow their viewers to alter how the stream proceeds: aspects that we also see in game streams (Lessel et al. 2018). As Li et al. (2020) found, interaction in live-streams is a social activity as well. Watching gaming in general is a key component of play, be it co-located or distributed (Smith et al. 2013). Tekin and Reeves (2017) also point out, in the context of co-located play, that spectating is more than watching others play games: spectators start to interact, for example by reflecting on past play or wanting to coach the player, which again can be seen as a form of interaction.
Overall, these aspects highlight that interactivity is important and inherent to the live-streaming experience. Works such as Lessel et al. (2018) show that the forms of interactivity are broad and start with streamers simply answering viewer questions or acknowledging the presence of viewers. Given the importance of interactivity, the broad range of options to realize interactive features and integrative behavior, and works such as Flores-Saviaga et al. (2019) and Deng et al. (2015) showing that there are highly frequented streams with thousands or tens of thousands of concurrent viewers, it seems clear that individual differences in preferences and in the perception of the available options should be understood and potentially accounted for. Investigating viewer types can contribute to this.
Understanding and enhancing interactivity in live-streams
For this section, we cluster research on interactivity into “understanding (and improving) individual interactive features,” “investigating new streaming platforms,” and “investigating new interactive content.”
Research done in the first cluster either analyzes features currently in use, investigates how these can be improved or altered with new content, or provides and studies novel elements. We will give an example for every aspect.
Much work has been done to understand how the text chat, the main communication channel in live-streams between viewers and streamers (Lessel et al. 2017b), is used. Works such as Hamilton et al. (2014), Olejniczak (2015), Musabirov et al. (2018a, 2018b) and Ford et al. (2017) investigated how it is used and how it changes in relation to the number of viewers. Hamilton et al. (2014) found that chats with more than 150 viewers are hard to maintain; they compare such chats to the roar of a crowd in a stadium, which dramatically changes the interaction options as well as the possible communication between streamer and viewer. Follow-up work (Musabirov et al. 2018a, b) compared the situation to a sports bar, where communication is still possible and viewers can switch between roaring and talking. Ford et al. (2017) name this communication “crowdspeak”: communication that seems chaotic and meaningless, but still makes sense to participants and makes massive chats legible and compelling. Olejniczak (2015) analyzed chat messages of different-sized channels (1k, 10k, 50k, 150k viewers) and found that the nature of the chat changes; for example, messages are visible longest in the 1k case, and emoticons are used most in the 150k case. These examples show that contextual factors (here, the number of viewers in a channel) have a significant impact on how an offered option (here, the primary communication channel) is used. Similar findings were reported by Flores-Saviaga et al. (2019).
As an example of how an identified drawback can be targeted, we want to highlight the work of Miller et al. (2017). They suggested an approach to overcome the information overload of the chat through the usage of conversational circles, in which viewers of a live-stream are dynamically partitioned and only see messages of other viewers in the same partition. Through a message upvoting mechanism, it is possible that messages from one partition are shown in others or even to all viewers. In a user study, the authors found that viewers appreciate such an approach and that more messages can be handled.
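To make the conversational-circles idea concrete, the following is a minimal sketch and not the implementation of Miller et al. (2017). The class and method names, the arrival-order assignment of viewers to circles, the fixed circle size and the upvote threshold are all assumptions chosen for illustration; the original system partitions viewers dynamically, and here a promoted message is simply shown to all viewers.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Message:
    author: str
    text: str
    upvoters: set = field(default_factory=set)


class CircleChat:
    """Toy chat that partitions viewers into circles (hypothetical design)."""

    def __init__(self, circle_size=3, promote_threshold=2):
        self.circle_size = circle_size              # assumed viewers per circle
        self.promote_threshold = promote_threshold  # assumed upvotes needed for promotion
        self.circle_of = {}                         # viewer -> circle index
        self.messages = defaultdict(list)           # circle index -> messages posted there

    def join(self, viewer):
        # Assign circles in arrival order; a real system would rebalance dynamically.
        self.circle_of[viewer] = len(self.circle_of) // self.circle_size

    def post(self, viewer, text):
        msg = Message(viewer, text)
        self.messages[self.circle_of[viewer]].append(msg)
        return msg

    def upvote(self, viewer, msg):
        # Authors cannot upvote their own messages; repeated upvotes count once.
        if viewer != msg.author:
            msg.upvoters.add(viewer)

    def visible_to(self, viewer):
        # A viewer sees messages from their own circle plus promoted messages.
        own = self.circle_of[viewer]
        visible = []
        for circle, msgs in self.messages.items():
            for msg in msgs:
                promoted = len(msg.upvoters) >= self.promote_threshold
                if circle == own or promoted:
                    visible.append(msg.text)
        return visible


chat = CircleChat(circle_size=2, promote_threshold=1)
for name in ["ann", "bob", "cat", "dan"]:
    chat.join(name)
suggestion = chat.post("ann", "play the left card")
chat.post("cat", "hello")
print(chat.visible_to("dan"))  # dan shares a circle with cat, not with ann
chat.upvote("bob", suggestion)
print(chat.visible_to("dan"))  # the upvoted message is now promoted to everyone
```

The sketch illustrates why such a design reduces overload: each viewer reads only their own circle's messages plus the few that others have marked as worth sharing.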
Another commonly used type of feature in streams is overlays, for example showing the latest follower, or what kind of background music is currently playing, directly in the video stream. In the work of Robinson et al. (2017), it was investigated how viewers perceive receiving “internal” information about the streamer, i.e., their heart rate, skin conductivity and emotions, presented in the form of such a video overlay. In their study, this was found to impact the viewer engagement, enjoyment and connection to the streamer positively, but was also distracting to a certain extent. While again the context (e.g., the game situation being streamed) might have had an impact on this perception, it might also be the case that such pieces of information are problematic for certain viewer types; something that might be better understood if a classification tool, as aimed for in this paper, were available.
An example of a newly introduced feature can be seen in the prototype Helpstone (Lessel et al. 2017b). It focused on a specific round-based trading card game (Hearthstone by Blizzard Entertainment), and we wanted to investigate how communication channels could be improved. The streaming situation of this game is often characterized by viewers giving suggestions or hints for the current game state; these become harder to process in larger channels (see above). To mitigate this, we proposed, among other features, a direct interaction option in which viewers can interact with the streamed video itself and give move suggestions in much the same way as they would carry out moves in the game. These are then automatically aggregated by the system and shown to the streamer directly in the game itself with arrows and a number indicating how many viewers suggested this move. In a small user study, we found that this direct interaction is appreciated by the streamer and viewers. An interesting finding here was that while every viewer appeared to like this feature, not everyone would actively use it: viewers apparently can be partitioned into active and passive viewers, hinting that there are at least two coarse-grained classes. Similarly, in a large-scale survey, Gros et al. (2018) found that not all viewers want to be involved in a stream, giving further support for these two classes. They also investigated what influences viewers’ desire for involvement and were able to identify factors such as age. This further supports the supposition that a viewer-type instrument could find other differences and subsequently could serve as an additional factor to consider in such studies.
In the second cluster, works are characterized by adding fundamentally new options to the streaming platform itself, or by the investigation of new platforms. Already in 2017, Twitch released a functionality called “Twitch Extensions” that allowed streamers to build custom features that could directly be integrated into their channel and allow viewers to interact with them. Besides the option to build such extensions from scratch by coding them, extensions made by others could also be integrated into one’s own streaming channel through a marketplace. This option increased the possible interaction and integration space, but at the same time also calls for systematic guidelines on which possibilities are reasonable for viewers: something viewer types could help with. With Rivulet, Hamilton et al. (2016) investigated how multiple streams coming from different streamers, but, for example, from the same event, could be combined to allow viewers to get a holistic view. The platform that they provided prominently showcased one of the streams, with the other streams shown in smaller windows in parallel, and with viewers being able to select which stream should be showcased. A main chat combined the individual streams’ chats, and as additional features, push-to-talk voice messages were possible, as well as heart emoticons, as a quick form of feedback. A follow-up work by Tang et al. (2017) found that, when it comes to viewer interactions, a multi-streaming approach is highly dependent on its context as well as on the viewers themselves (voluntary viewers varied more widely in, for example, how long they watched, and had fewer interactions through the text chat than viewers recruited via Amazon Mechanical Turk). Again, this indicated that different motivations exist, and that these can directly impact the interactions with the streams and the offered interaction options.
Research of the third cluster focuses on allowing viewers to directly impact the underlying content that is streamed. In the context of games, this means allowing the audience to impact how the game the streamer is playing proceeds, or giving (at least some) viewers the option to directly participate in the game. Both of these aspects are not only considered in research, but are also common in typical streams today. For example, streamers use polls to let viewers decide what they should play or do next, or offer to play against some of them in multiplayer games (Lessel et al. 2018). Commercial games like Choice Chamber even allow the audience to directly impact, through periodic polls, what happens for the streamer in the game (e.g., which enemies appear).
Also, a special form of stream without a streamer has appeared, in which the audience alone controls the game that is broadcast. “Twitch Plays Pokémon” is a prominent representative of this experience: the game Pokémon Red was streamed and the game’s avatar could be controlled by viewers through entering special commands into the chat, without any form of moderation. At the peak, 121,000 people played the game simultaneously. In the beginning, every game command entered was executed automatically, leading to game situations that delayed progress for hours (e.g., because the game’s avatar moved one step left and then right for hours). Nonetheless, the game was completed by the audience in under 17 days. During the game, a plurality voting mechanism was introduced, in which commands were aggregated and only the command most often provided in a given time frame would be executed. This voting mode was not always active: the viewers themselves decided (again through voting) which command mode was active. Scientific work such as that of Ramirez et al. (2014) analyzed the experience and found that viewers were not uniform and perceived this experience differently: for example, some liked the chaos associated with the execution of all entered commands, while others wanted the game to proceed faster. As we will illustrate below, these differences might be explainable by player types in games, but viewer types, if they exist, might have also had an effect.
We conducted two case studies in this third cluster (Lessel et al. 2017a). In the first study, we investigated a popular tabletop roleplaying stream and analyzed what kind of options the streamers provide their audience to let them impact what happens. In the second study, we investigated a “Twitch Plays Pokémon” setting in a laboratory study with more features compared to the original run (e.g., more voting modes). In both studies, the results highlight individual viewer differences, as, for example, not every viewer participated in the polls (first study), and some viewers focused on social actions instead of game-related actions (second study).
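The two command modes described above for “Twitch Plays Pokémon” can be contrasted in a minimal sketch. It is not the original implementation: the function names are hypothetical, the time frame is approximated by a fixed number of chat commands rather than seconds, and ties are resolved in favor of the command that appeared first in the window, which is an assumption.

```python
from collections import Counter


def anarchy(commands):
    """Execute every command in arrival order (the initial mode)."""
    return list(commands)


def plurality(commands, window_size):
    """Execute only the most frequent command of each window.

    window_size counts commands instead of seconds (a simplification);
    on a tie, the command seen first in the window wins (an assumption).
    """
    executed = []
    for start in range(0, len(commands), window_size):
        window = commands[start:start + window_size]
        winner, _ = Counter(window).most_common(1)[0]
        executed.append(winner)
    return executed


chat_commands = ["left", "right", "left", "up", "a", "a", "b", "left"]
print(anarchy(chat_commands))       # all eight commands are executed
print(plurality(chat_commands, 4))  # one command per window of four
```

Under the first mode, alternating inputs such as left and right cancel each other out, which matches the stalled situations reported above; under plurality voting, each window yields a single move, at the cost of discarding minority inputs.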
In sum, although it was not their actual research focus, research on interactivity already provided indications that viewers behave differently, thus making it reasonable to assume that viewer types could be identified and measured, supporting this line of research further.
Motivation and individual differences in live-streams
In this section, we focus on research that aims to understand viewers’ motivations in the context of game streams.
Wohn and Freeman (2020) conducted qualitative interviews with streamers (with varying channel sizes) to learn how they perceive their audience. It became clear that streamers can identify different classes of viewers, such as viewers who feel they are part of the family, or who are fans, trolls or lurkers. To identify their audience, streamers actively probed by asking questions and remembering certain viewers over the course of the stream. Cheung and Huang (2011) focused on the popular real-time strategy game Starcraft (by Blizzard Entertainment). By analyzing online sources in which viewers shared their stories of spectating the game, they identified nine different personas, ranging from The Uninformed Bystander (a person who is merely watching by coincidence and does not know anything about the game he or she watches), to The Pupil (a person who watches to understand and learn the game better), to The Assistant (a person who wants to help the player). The authors highlighted that a spectator can have multiple personas at once. While this investigation was not strictly bound to interactive live-streaming, the nine personas already indicate that different preferences and motivations apparently exist, although the work did not provide a way to measure them.
We conducted an online questionnaire for viewers of game live-streams and let them rate a broad range of features and streamers’ behaviors in the context of these streams (Lessel et al. 2018). It became obvious that many elements were only perceived well by subsets of participants, again indicating individual differences. Based on the personas identified by Cheung and Huang, statements (one per persona) were presented through which participants were to indicate their motivations for watching live-streams. The answers to these statements were set in relation to the element ratings, and it was found that the absence or presence of a certain motivation has an impact on the ratings. We stated that more work should be invested in the derivation of viewer types, as our approach only revealed that there are individual differences with an impact on the perception of elements.
Another classification was presented by Seering et al. (2017). The authors investigated their own audience participation games (games like the mentioned Choice Chamber) and derived archetypes from post-session survey data: Helpers (who want to help the streamer to reach his/her outcome), Power Seekers (who want to have impact on the game, regardless of whether it helps or hinders the streamer), Collaborators (who want to collaborate with other viewers and with the streamer, regardless of the outcome), Solipsists (who focus on obtaining personal benefits, e.g., learning or meeting new people) and Trolls (who want to hinder the streamer). While it is again interesting to see different behaviors, no direct options to assess these archetypes were provided.
Yu et al. (2020), on the other hand, investigated whether there are relationships or trends between the perception of features enabling interaction with streamers and Hexad user types (which were originally developed by Tondello et al. (2016) to explain user behavior in gamified settings). The authors conducted an online survey with 50 participants, in which the Hexad user type was assessed, as well as preferences regarding eight different interactive features frequently used in game live-streams. Their results show that there are certain relationships, such as Socializers preferring affiliation features and chat input. While this is valuable work for identifying individual differences together with a way to assess them, in contrast to Yu et al. (2020), we argue that the experience of consuming game live-streams is very different from directly interacting with a gameful system (which is the scope of the Hexad model used). Consequently, to explain preferences for game live-stream features, a specialized viewer-type questionnaire seems worthwhile, instead of using a model developed for gamified applications.
Another line of work focused directly on the motivations of viewers. The work of Wohn et al. (2018) is an example that aimed to understand why viewers donate money to a streamer. They found different motivations for this, such as the goal to improve the content, or to pay for the entertainment value. Sjöblom and Hamari (2017) wanted to understand why people become viewers of game streams and whether it could be predicted how much people will watch, and how many streamers they will follow and subscribe to. They found positive relationships between motivations from the uses and gratification perspective and the aforementioned variables. Kordyaka et al. (2020) were also interested in finding the technological and social variables that best describe users’ motivations for consuming game live-streams. They drew on the Affective Disposition Theory and the Uses and Gratification Theory and used an online questionnaire in which participants answered questions about, for example, their consumption behavior. They propose a unified model, in which the perception of identification with and liking of a streamer and interactivity predict how much a stream is consumed, with interactivity as the most important predictor. While the latter part is in line with the work presented above, the paper does not discuss individual differences, something that we aim for with the viewer types.
Gros et al. (2017) also used a questionnaire approach to find out the motivations for why Twitch is consumed (in the main categories entertainment, socialization and information), with statements relating to entertainment receiving the highest values. The authors also found differences in answers to statements relating to socialization aspects, depending on whether users had already donated to the streamer or not; i.e., those who have a stronger social motivation are more likely to donate, further highlighting individual differences and the existence of a relationship between motivation and viewers’ actual actions. Sjöblom et al. (2017) also investigated contextual factors (e.g., which games are played and which presentation form is used) and here also found a strong impact, in that “particular stream types and game genres serve to gratify specific needs of users”, as well as individual differences depending on the form of gratification sought (e.g., social and personal integration vs. seeking tension release). Hilvert-Bruce et al. (2018) also wanted to understand why viewers engage (such as by subscribing or donating) in live-streams. They found that six of eight motivations (such as sense of community or the desire to meet new people) indeed influence the willingness to engage, and also that the channel size (as a contextual factor) has an impact.
Seering et al. (2020) analyzed 183 million messages from Twitch streams and found differences in the behavior of first-time participants vs. regulars (e.g., the former write shorter messages, ask more questions and engage less broadly with others in the community). In addition, some indications were found that the context also has an impact on the subsequent behavior of the former (e.g., whether moderators had interacted before). While giving further examples of contextual differences, this paper also shows that individual factors that are less persistent (here: amount of stream content consumed in the past) than viewer types or personality (which is assumed to change more slowly over time: see Roberts et al. (2006)) might also have an impact on how viewers behave, and are thus another factor to be considered when trying to understand viewer behavior.
All these works show that motivational structures are related to viewers’ behavior, for example, how long they watch a stream. Additionally, some classifications were suggested that were derived through various means. However, to our knowledge, a systematic way to assess viewer classes specifically has not yet been provided. In this work, we not only want to shed light on which viewer types exist, but also on how to measure them, and on whether we can use them to explain differences in the perceptions of features or streamers’ behaviors.
User-type classifications in game-related contexts
Understanding how and why users interact with games and gameful systems is considered fundamental to improving the users’ experience (Tondello et al. 2019; Hamari and Tuunanen 2014). Consequently, substantial efforts have been made to categorize users into certain types, based either on their motivations and needs or on how they behave in games and gameful systems (Hamari and Tuunanen 2014). We will elaborate on these approaches briefly to show how these classifications were derived. One of the first attempts to classify players in Multi-User Dungeons (“MUD”) was proposed by Bartle (1996). The typology was established by analyzing bulletin-board postings referring to a question asking what players want out of a MUD. As a result, two dimensions arose (action vs. interaction, player orientation vs. world orientation) along which playing can be categorized. Within these dimensions, four player types were established: Achievers, Explorers, Killers and Socializers. When it comes to assessing and using these player types practically, numerous issues have been identified (Bateman et al. 2011). One criticism is related to the fact that the typology is based on motivations and preferences of MUD players, which limits its generalizability to other games or gameful settings (Bateman et al. 2011). Also, the player typology has never been empirically validated, which poses a severe threat to using it for scientific purposes (Bateman et al. 2011; Busch et al. 2016). To tackle this issue, Yee (2002) conducted empirical studies about player motivations. These were based on Bartle’s player types, i.e., Yee brainstormed potential motivations of players guided by the work of Bartle, and came up with statements to rate them. Yee used factor analysis to validate five motivational factors in this first article (Yee 2002). In follow-up works, Yee (2007) empirically derived three main factors which motivate players of online games, namely Achievement, Social Factors and Immersion. Although these empirical studies allow one to assess the motivations of online players reliably, the limited focus on MUDs and thus the lack of generalizability persist (Busch et al. 2016).
Instead of relying on motivations of players, the BrainHex model (Nacke et al. 2014) is based on a series of demographic game design studies and neurobiological research. It presents seven player types, such as the Seeker (motivated by curiosity) or Daredevil (motivated by excitement). In a survey with more than 50,000 participants, BrainHex archetypes were assessed using textual descriptions that the authors created. However, an instrument to assess BrainHex archetypes has never been validated, and other researchers have demonstrated substantial flaws in the validity of the model (Busch et al. 2016; Tondello et al. 2019). Therefore, the BrainHex scale cannot reliably be used to classify player preferences.
In a more recent work, Tondello et al. (2018b) analyzed the dataset from the BrainHex survey and found support for only three out of seven BrainHex archetypes: action orientation (represented by the Conqueror and Daredevil archetypes), aesthetic orientation (represented by the Socializer and Seeker archetypes) and goal orientation (represented by the Mastermind, Achiever, and Survivor archetypes). Based on the work by Yee (2002, 2007) and on the literature review by Hamari and Tuunanen (2014), the authors suggested adding social orientation and immersion orientation as additional factors. In a follow-up work by Tondello et al. (2019), this five-factor model, and a scale to assess the five factors, were empirically validated. To generate items for each of the five proposed player traits, the researchers used a brainstorming approach, after which each member of the research team wrote several suggested items that could be used to score someone on that trait. Afterward, the suggested items were discussed and the best items were selected.
Instead of focusing on games, the Hexad user-types model targets gamification (as already stated above). It was initially proposed by Marczewski (2015). In contrast to previous models, which are mostly based on empirical studies and observations, the Hexad model is based on Self-Determination Theory (“SDT”) (Ryan and Deci 2000). The Hexad model consists of six user types, which differ in the degree to which they are driven by their needs for autonomy, competence, purpose and relatedness, as defined by SDT. For example, Philanthropists are socially minded; they share knowledge with others and are driven by purpose. Tondello et al. (2016) created a survey to assess Hexad user types. As a first step, a workshop with six experts was held to generate a pool of items for each of the user types, which were agreed upon through discussion (Tondello et al. 2016; Diamond et al. 2015). Next, the first version of the Hexad scale was introduced and the instrument’s ability to explain user preferences for gameful design elements was demonstrated (Tondello et al. 2016). More recently, the scale was slightly adapted and its validity was empirically demonstrated (Tondello et al. 2018a). Högberg et al. (2019) also contributed to the field of user modeling in gamified systems by proposing and validating a questionnaire to model and assess gameful experiences of a gamified system. Like us, they followed an empirical approach, starting with a pre-study to inform the item generation process, then creating and refining items for the questionnaire, and finally validating it.
To sum up, previous work shows that player or user typologies in the context of games and gameful settings have been under investigation for more than two decades, and this is still an active research field. While the aforementioned studies focused on the player or user actively engaging and interacting with a game or gameful system, we shift our focus to viewers or spectators of such games, to account for the increasing popularity of game live-streams. We believe that, although the content of such live-streams is also games, the experience of spectating someone else playing a game is different from interacting with it directly. Therefore, we contribute a typology of game live-stream viewers, which is based on an empirical study of their motivations and preferences. While this sheds light on the needs and motivations of viewers, we go a step further and contribute an instrument to assess viewer types and investigate which features in game live-streams are particularly relevant for each viewer type, mirroring to a certain extent the approach of Tondello et al. (2016).