1 Introduction

Video games have become a highly visible part of human practice and culture. However, the experiences they evoke are in no way limited to systems and services purposefully designed as games. In fact, during the last few years, gamification has joined such phenomena as artificial intelligence, big data, and crowdsourcing as contemporary megatrends. Gamification refers to the transformation of technology to become more game-like, with the intention of evoking positive experiences and motivations similar to those that games evoke (the gameful experience) and affecting user behavior. Systems, services, and organizational structures are increasingly intentionally imbued with game-like qualities (Hamari et al. 2018; Vesa et al. 2017), and gamification has been applied in widely differing contexts, such as commerce, health, sustainability, software development, and research (Seaborn and Fels 2015; Hamari et al. 2014). Within the health context, for example, hunting for Pokémon in Pokémon Go helps promote both physical and social activity for an inactive general population (LeBlanc and Chaput 2017).

However, this ability to create gameful experiences is not limited to games and gamified services. Technological advances have offered ample opportunities for playful and positive experiences to be included in the use of more traditional systems, even though such systems are not designed for that purpose (see, e.g., Webster and Martocchio 1992). Some researchers (e.g., Prensky 2012; Granic et al. 2014; Vesa et al. 2017) have also argued that contemporary people and so-called “digital natives” may be more susceptible to the gameful experience even in “non-game contexts,” as a consequence of having learned, through playing games, motivational orientations and ways of engaging in activities that have seeped into everyday life. As such, we believe that society is facing a cultural shift powered by the technological development of more gameful experiences in people’s lives and society.

Research into behavior change interventions based on digital services is growing, and the fact that the effectiveness of motivational strategies is dependent on the user makes the personalization of such interventions important (Masthoff et al. 2014). When gamifying such services, the gameful experience afforded by these services drives their effect on behavior (Huotari and Hamari 2017; Seaborn and Fels 2015; Werbach 2014; Landers et al. 2018). Therefore, this experience must be in focus when developing gamified services. This experience is subjective, which means that a service that leads to the creation of gameful experiences for some people will not necessarily do so for others (Huotari and Hamari 2017). This subjective nature of the gameful experience means that personalization should be a viable approach to improve the ability to afford such experiences and, as such, improve the effect of gamifying on the target behavior. With personalization, gamification is adapted to the motivational strategy that is effective for a specific user.

If this subjective nature of the gameful experience is to be understood, it is necessary to be able to measure the gameful aspect of individuals’ experiences of service use, both in order to understand the effect of gamification and also to leverage the full potential of such services by facilitating user adaption based on such experiences. Thus, having sought to address the research problem of how to measure the gameful experience across a variety of systems and services, we developed and validated an instrument that measures the gameful experience as a holistic state. First, we conducted an exploratory qualitative study. In a survey with open-ended questions, we investigated users of Duolingo, Nike+ Run Club, and Zombies, Run! and their experiences while using these gamified services. In a second exploratory and quantitative study, we developed the measurement instrument and evaluated its factorial and psychometric properties using data collected among users of Zombies, Run! (N = 371). In a third confirmatory and quantitative study, based on the results of the second study, we improved the instrument, repeated the evaluation of the instrument’s dimensionality and psychometric properties, and utilized confirmatory factor analysis to validate it. Data for the third study was collected from users of Duolingo (N = 507). As a result of this research, we developed the instrument, called GAMEFULQUEST, which can be used to measure and model the individual user’s gameful experience of systems and services.

1.1 Theory

Within the field of game studies, it has been difficult to define games, even to the extent that the difficulty can be regarded as an inside joke in the field (Stenros 2017). One established way to define games is to describe them as having a number of necessary features or conditions. In an attempt to make a synthesis of definitions, Juul (2003) described six such conditions: (1) games are based on rules; (2) they have variable outcomes that are quantifiable; (3) different outcomes in a game are assigned different values, both positive and negative; (4) effort must be invested to affect the outcome; (5) the outcome is important to the player; and (6) optionally, games can have real-life consequences. According to Juul (2003), these conditions are sufficient for something to be a game. However, it is not difficult to find instances in which these conditions are present but a game does not emerge, such as work. Work is also based on rules (company policies) and has quantifiable outcomes, such as salary level (Huotari and Hamari 2017; Vesa et al. 2017; Stenros 2017), and the other suggested conditions are equally applicable to work; therefore, following the logic of Juul (2003), work could qualify as a game. However, since most people presumably do not consider work to be a game, something must be missing that is crucial to a game. We argue that a more pronounced experiential component is required when describing what constitutes a game.

As an alternative to this type of systemic definition, games can also be defined from an experiential perspective, or from the perspective of psychology (Huotari and Hamari 2017). Juul (2003) included such an experiential condition in his model when he described a demand for involvement. However, due to the great diversity in games that affect the experience of playing them (Ijsselsteijn et al. 2007), we argue that the experiential component requires a more detailed description in order for games to be thoroughly understood. In fact, this diversity among games makes it reasonable to state that games afford no single experience. Instead, a game is recognized only through a combination of different experiences, which underscores the multidimensional aspects of the game experience.

The experience-systemic dichotomy is also prevalent within the field of gamification. Definitions of gamification have focused either on the experiential aspect—that is, the gameful experience (e.g., Huotari and Hamari 2017)—or on the game design; that is, what design can be used when gamifying (e.g., Deterding et al. 2011). This distinction is important because we believe that the emergence of the gameful experience is necessary to reach the intended goal when gamifying. Since the gameful experience acts as a mediator between the motivational affordances of the gamified service and the targeted behavioral outcome (Huotari and Hamari 2017; Landers et al. 2018), there is no point in gamifying if the aim is not to achieve a gameful experience.

It has been suggested that it is beneficial to personalize incentives when gamifying since different people have different motivations (Vassileva 2012). However, because it is the gameful experience that is the driver of the targeted behavioral outcome when gamifying (Huotari and Hamari 2017; Seaborn and Fels 2015; Werbach 2014; Landers et al. 2018), the creation of such experiences should also be a possible subject for personalization. In fact, since the gameful experience is subjective (Huotari and Hamari 2017), user-adapting gamified services should be a valid approach for improving the ability of gamified services to afford such gameful experiences and, as such, improve their ability to change the targeted behavior. Today, there is growing interest in adaptive gamification within the literature (Böckle et al. 2017), although most of this research is theoretical (Tondello and Nacke 2018). For example, Orji et al. (2017) found that personality type, as defined by the five-factor model (Goldberg 1993), affected the effectiveness of different game-based persuasive strategies to motivate users. Orji et al. (2014) found that gamer types, as defined by the BrainHex model (Bateman et al. 2011), affected the perceived persuasiveness of such strategies. Adaji and Vassileva (2017) argued that gamified apps can be personalized according to shopping behavior, as defined by the categorization of Moe (2003), to incite healthy shopping behavior. In response to the lack of empirical evidence for the effect of adaptive gamification, such studies are on the way (e.g., Tondello and Nacke 2018).

1.2 The game experience

The effect of gamification on the target behavior relies on the gameful experience that gamified services create (Huotari and Hamari 2017; Seaborn and Fels 2015; Werbach 2014; Landers et al. 2018). However, despite its importance, the gameful experience is not a well-developed concept within gamification research. There are only a few substantial contributions on this construct, and these are recent. For example, Eppmann et al. (2018) developed a model of the gameful experience and a corresponding measure, which are discussed below. Another example is Landers et al. (2018), who formally defined gameful experience through three psychological characteristics that lead to such a gameful experience: (a) perceiving that goals are non-trivial and achievable; (b) a desire to pursue these goals, albeit under rules that are limiting and that the user is willing to abide by; and (c) a belief that participation is voluntary. This focus on psychological characteristics that lead to gameful experiences distinguishes Landers et al. (2018) from our work, since we focus on describing the gameful experience per se.

Because of this limited knowledge on the subject, we have turned to digital games and game experience research to more thoroughly understand the gameful experience. Game experience has been defined as “an ensemble made up of the player’s sensations, thoughts, feelings, actions, and meaning-making in a gameplay setting” (Ermi and Mäyrä 2005). The game experience is co-created (Huotari and Hamari 2017; Normann and Ramírez 1993; Vargo and Lusch 2004) in the interaction between the game and the gamer. This means that the gamer actively takes part in its construction (Ermi and Mäyrä 2005; Huotari and Hamari 2017). A game can be experienced during three different phases: (1) the pregame phase, which comprises everything that happens before using a game; (2) the game phase, which includes the actual time the game is used; and (3) the postgame phase, which includes both the time after a single gaming session and the time that stretches beyond this single event—meaning that the effects of repeated gaming are considered (Elson et al. 2014). Several researchers have described the game experience as multidimensional (e.g., Elson et al. 2014; Poels et al. 2007; Takatalo et al. 2010). The next sub-section reviews commonly used dimensions that describe this experience, including an overview of instruments used to measure the game experience and its dimensions (Table 1).

Table 1 Questionnaires used to measure the game experience or dimensions of the game experience

1.2.1 Dimensions of the game experience

Playfulness

Games are played: that statement clarifies that play and games are two intrinsically intertwined concepts. Saying that games are played indicates that “playing games” is a subset of play. However, play is also a dimension of games (playing a game partly contains elements of play), so a person can be in a playful state of mind when playing a game (Salen and Zimmerman 2004). PLEX, a conceptualization of playful experiences related to software and games, takes a holistic view of playfulness and consists of 22 categories of experiences (Lucero et al. 2013). Many of these categories overlap with other models describing the game experience. In contrast to this broad conceptualization, playfulness has also been depicted as a sub-dimension of the experience of playing games (e.g., Takatalo et al. 2010).

Affect

Games can be a powerful inducer of emotional states because of the cognitive, emotional, and kinesthetic feedback loop between the game and the player (Calleja 2011). Such emotional states, or affect, have been used to describe the emotional aspects of specific dimensions of the game experience. For example, Brown and Cairns (2004) described an emotional attachment in deeper levels of immersion, and Johnson and Wiles (2003) suggested that experiencing flow when playing games induces positive emotion, which has implications for affective design. The emotional aspects of games are also reflected in the inclusion of positive and negative affect in the holistic measure called the Game Experience Questionnaire (Ijsselsteijn et al. 2008). This inclusion of negative affect is important because it means that the experience of games is described as being partly a negative one.

Enjoyment

Enjoyment is a central aspect of how games are experienced (Mekler et al. 2014). In fact, enjoyment is arguably the primary objective of a game, since people would not play if they did not enjoy the experience (Sweetser and Wyeth 2005). Enjoyment may be described as both a dimension and an outcome of the game experience. For example, Poels et al. (2007) listed enjoyment as one of nine dimensions with which to describe the game experience, and the GameFlow model (Sweetser and Wyeth 2005) serves as an example of enjoyment as an outcome of the experience of playing a game.

Flow

Flow recurs in descriptions of the game experience (e.g., Poels et al. 2007; Brockmyer et al. 2009; Sweetser and Wyeth 2005; Cowley et al. 2008) and is characterized by intense concentration, altered sense of time, and a sense that action and awareness are merging (Csikszentmihalyi 2014a, b). A person in the state of flow is autotelic; that is, he or she does something for its own sake rather than for an external outcome (Csikszentmihalyi 2014a, b). Flow occurs when activities are performed with a perceived balance between challenge and skill (Csikszentmihalyi 1975).

Immersion

Flow is closely related to the construct of immersion, which is also commonly found in the game experience literature (Brockmyer et al. 2009; Brown and Cairns 2004; Cairns et al. 2014; Calleja 2007; Poels et al. 2007; Jennett et al. 2008; Ijsselsteijn et al. 2007). Immersion has been characterized as getting into a cognitive state of being “in the game” (Cairns et al. 2014), in which the gamer experiences being surrounded by another reality that consumes all of his or her attention (Murray 1997). The gamer might also feel isolated from the real world (Patrick et al. 2000). While flow is described as an optimal experience, immersion might include negative experiences, such as negative emotions and anxiety (Jennett et al. 2008).

Challenge

Being challenged is necessary for flow to occur (Csikszentmihalyi 1975). Therefore, the experience of being challenged is indirectly part of the game experience, but it is also described as a dimension of the game experience in its own right (e.g., Ijsselsteijn et al. 2008; Malone 1981; Sherry et al. 2006). The feeling of being challenged is also related to achievement, which Yee (2006) found to be one of three overarching motives for playing games. As such, gamers choose games (or levels of difficulty in games) that challenge their abilities and allow them to strive for achievement (Vorderer et al. 2004).

Skill

Skill is also indirectly connected to the game experience by its relationship to flow theory (Csikszentmihalyi 1975) and, just like challenge, skill has been used in its own right to conceptualize the game experience. Poels et al. (2007) described competence as an in-game experience of both pride and accomplishment. In addition, as part of self-determination theory (Ryan and Deci 2000), competence has been used to understand the game experience and its relationship to intrinsic motivation (Przybylski et al. 2010; Ryan et al. 2006; Rogers 2017).

Competition

Vorderer et al. (2003) noted that challenge is necessary for a game to be enjoyable. However, they described challenging tasks or hindrances as competitive elements, implying that the gamer is engaged in a competition with the game per se. Competition may also be induced by the social situation of competing against an opponent, either real or computer-controlled (Vorderer et al. 2003). Others (e.g., Yee 2006; Sherry et al. 2006) have also acknowledged these competitive aspects of how games are experienced.

Social experience

Competition can be induced by a social situation (Vorderer et al. 2003) and is, therefore, partially a social experience. However, the social experience of games can take on different forms, such as socializing, relationship formation, and teamwork (Yee 2006). For example, Rogers (2017) found evidence of feelings of connectedness to other people when playing games. However, a social experience does not need to stem from the presence of real people. In fact, social presence has been described as a state in which a gamer experiences virtual social actors as actual ones (Lee 2004).

Presence

In addition to the social presence described by Lee (2004), another category that has been used to illustrate the game experience is presence. Presence has been described as an illusion of non-mediation or, more simply put, as a sense of being in a computer-generated world instead of using a computer (Lombard and Ditton 1997; Ermi and Mäyrä 2005). For presence to occur, the game must allow gamers to represent themselves in the game. An example of this type of game is a first-person shooter (Cairns et al. 2014). As an illusion of non-mediation, presence will only occur if the gamer fails to acknowledge the medium (Lombard and Ditton 1997); this means that the sensory experience must support such lack of acknowledgement.

Sensory experience

As Wyeth et al. (2012) pointed out, it is important to understand the relationship between sensory experience and presence (in addition to other dimensions of the game experience). Many researchers have included such sensory experiences in their descriptions, even though the sensory experiences that are included vary. Some authors only include visuals (e.g., Wiebe et al. 2014). Visuals seem always to be represented when audio is included (e.g., Calvillo-Gámez et al. 2010; Ermi and Mäyrä 2005), and both audio and visuals are included when touch is represented (e.g., El Saddik 2007; Witmer and Singer 1998). Thus, these variations appear to be systematic.

1.3 Distinguishing gameful experience from game experience

Most of the dimensions of the game experience reviewed above are also mentioned in the gamification literature; for example, flow (Hamari and Koivisto 2014), playfulness (Hamari and Koivisto 2015), and challenge (Hildebrand et al. 2014). However, there are several differences between games and gamified services, which make instruments and models of the game experience inadequate for the gameful experience.

Systems have traditionally been categorized as either utilitarian (Davis 1989; van der Heijden 2004) or hedonic (van der Heijden 2004). As such, games have functions that are implemented for hedonic purposes (van der Heijden 2004). However, this dichotomy does not apply to gamified services since they have functions implemented for both utilitarian and hedonic purposes (Hamari and Koivisto 2015). In addition, gamified services aim to intrinsically motivate a target behavior (Hamari et al. 2014; Mora et al. 2015; Huotari and Hamari 2017; Rigby 2015; Seaborn and Fels 2015). Therefore, gamification ultimately aims to change behaviors that have consequences beyond the service per se, such as exercising. This is reflected in some definitions of gamification, which claim that gamification relates to non-game contexts (Seaborn and Fels 2015; Deterding et al. 2011), even though this view of gamification as necessarily happening in non-game contexts has been criticized (see Huotari and Hamari 2017 for discussion). Thus, while games have hedonic functions and goals, gamified services also include utilitarian functions and utilitarian goals beyond the service use. This means that some facets of the gameful experience might directly support the goal of a gamified service, but not that of a game. This renders such facets more salient to the users of gamified services, which means that a model detailing such facets for the game experience will not be adequate for the gameful experience.

This focus on the target behavior has additional implications. Skill has been described as part of the game experience (Przybylski et al. 2010; Ryan et al. 2006; Rogers 2017). However, for gamified services, this experience will be affected by the skill of the target behavior and not just by the skill of the game. This means that experiences that are associated with the skill of the user (like challenge and flow) will also be affected by the target behavior. Additionally, when the target behavior is not strongly associated with the service use per se (for example, a gamified service promoting exercising that is only used between exercises), the gameful experience needs to extend beyond the game phase and into the postgame phase to motivate the target behavior (see Elson et al. (2014) regarding game phases). Finally, for most gamification implementations, it is not possible to create the same type of immersive sensory experiences as is possible with games (Hamari and Koivisto 2014). Therefore, it seems safe to assume that, for instance, presence, which only occurs if the user of a service fails to acknowledge the medium (Lombard and Ditton 1997), will not be a facet of using a gamified service.

In sum, this means that there are essential differences between games and gamified services, making models and measures from games research inadequate for use within gamification research. This inadequacy is corroborated by our overview of game experience measures found in Table 1. Among the game experience measures, only CEGEQ (Calvillo-Gámez et al. 2010) and GEQ (Ijsselsteijn et al. 2008) aim to holistically describe the game experience, making them comparable to our scope. Unfortunately, there is no published peer-reviewed psychometric validation of the GEQ and there have been problems replicating the suggested factor structure for it (Law et al. 2018). There also seems to be a lack of psychometric validation for CEGEQ, except for Cronbach’s alpha of the full questionnaire found in Calvillo-Gámez et al. (2010). In addition, most of the dimensions of CEGEQ do not correspond to commonly found constructs within game experience, gamification, or psychology research. Many of the instruments presented in Table 1 are trait measures and not experience measures; for example, Brockmyer et al. (2009) measured the tendency to engage in video games. Finally, many of these instruments include specific items that will not work well for services that are not games; for example, “I was interested in the game’s story” from the Game Experience Questionnaire (Ijsselsteijn et al. 2008). Therefore, there are several reasons related to the existing game experience measures, in addition to the conceptual ones, that enable us to conclude that there is a need for a model and an instrument specifically developed for the gameful experience.
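For readers unfamiliar with the statistic mentioned above, the following is a minimal sketch of how Cronbach’s alpha is conventionally computed from item-level responses. The function name and the toy data are hypothetical and are not taken from any of the cited instruments; the sketch assumes complete responses and uses sample variances throughout.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    if any(len(row) != k for row in responses):
        raise ValueError("every respondent must answer every item")

    # Transpose so that each tuple holds all respondents' scores on one item.
    item_columns = list(zip(*responses))
    item_variance_sum = sum(variance(col) for col in item_columns)

    # Variance of each respondent's summed score across all items.
    total_variance = variance([sum(row) for row in responses])

    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical toy data: four respondents, three items that agree perfectly.
toy = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
alpha = cronbach_alpha(toy)
```

A single alpha for a full questionnaire summarizes internal consistency only; it does not by itself show that the proposed dimensions form separable factors, which is why it is treated above as limited evidence of validation.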

1.4 The gameful experience

Even though game experience models and instruments are inadequate for depicting the gameful experience, they are still useful for gameful experience research due to the inherent relationship between games and gamified services. Therefore, in line with our review of (a) the game experience, (b) the dimensions of the game experience, and (c) gamification, we view the gameful experience as co-created and multidimensional. The gameful experience may occur, but does not have to, when a user of a service interacts or has interacted with intentionally or unintentionally implemented motivational affordances (for gameful experiences). The goal of creating such gameful experiences is to spur motivation for both continued service use and for a targeted behavior. Therefore, the intended effect of a gamified service stretches beyond the game phase and into the postgame phase.

A model and a measure (GAMEX) for the gameful experience were only developed recently (Eppmann et al. 2018). GAMEX includes six dimensions: enjoyment, absorption, creative thinking, activation, absence of negative affect, and dominance. Three of these (activation, absence of negative affect, and dominance) are commonly found in descriptions of affect (e.g., Russell 1980; Mehrabian and Russell 1974). Emotions have been described as reflecting a wisdom of ages (Lazarus 1991), where such emotions are superordinate programs that coordinate behavior to be functional (from an evolutionary point of view) (Cosmides and Tooby 2000). This means that these emotional states relate to experiences on a decidedly general level. In addition, GAMEX includes an enjoyment dimension. Even though enjoyment is an important aspect of gamified services, it is also a general concept. One could even argue that enjoyment is too general to be a truly meaningful descriptor of the unique concept of gameful experience (see, e.g., Cairns et al. 2014)—an argument that could also hold for affect.

When developing GAMEX, Eppmann et al. (2018) created an item list that was extracted from 22 papers describing measures of the game experience or other constructs that the authors found relevant; from this list of items, a model was extracted using exploratory factor analysis. Eppmann et al. (2018) argued that affect and enjoyment are both important aspects of the gameful experience and included such items in their initial item pool. In our estimation, this decision is the cause of this general focus of GAMEX. Even though we agree with Eppmann et al. (2018) on the importance of both affect and enjoyment for gamification, we also believe that they could be treated as outcomes of the gameful experience, rather than dimensions of it. As such, they could be measured using existing instruments such as the Intrinsic Motivation Inventory [footnote 1] to measure enjoyment, and PAD (Mehrabian and Russell 1974) or PANAS (Watson et al. 1988) to measure affect.

Finally, the gameful experience needs to be distinguished from user experience and user engagement, both of which are concepts that aim to describe the experience of using software and technology. While the traditional focus of user experience is mainly on usability (Ijsselsteijn et al. 2007; Wright and Blythe 2007), newer streams of research conceptualize the user experience more holistically and include hedonic aspects (e.g., Hassenzahl and Tractinsky 2006; Hassenzahl 2008). User engagement is a quality of this user experience that can be described using hedonic attributes such as interest, challenge, and positive affect (O’Brien and Toms 2008). Therefore, since both user engagement and later streams of user experience focus on hedonic experiences, they overlap with the gameful experience. However, while the user experience and user engagement relate to experiences of services or systems in their entirety, the gameful experience is created specifically in response to interacting with affordances for gameful experiences (Huotari and Hamari 2017). Therefore, it is an experience that emerges from the interaction with the game aspects of such systems or services and will lead to engagement, both with the usage of the service per se and with the target behavior.

2 Study 1: Dimensions of the gameful experience

2.1 Method

As a foundation for this research, we used the process for scale development described by DeVellis (2012). We took the following steps: (1) determine what to measure, (2) generate an item pool, (3) determine the measurement format, (4) have experts review the item pool, (5) administer the item pool to a development sample, (6) evaluate the items, and (7) optimize the scale length. In Study 1, we aimed to determine what to measure. To do this, we identified and described the dimensions that constitute the gameful experience using a qualitative approach. We also wanted qualitative data to inform future item generation. We used three surveys with open-ended questions in which respondents reported their experiences regarding various game elements found in three gamified services: Zombies, Run! [footnote 2], Duolingo [footnote 3], and Nike+ Run Club [footnote 4]. Duolingo is a gamified service that focuses on motivating language study, while both Zombies, Run! and Nike+ Run Club are gamified services that aim to motivate users to run. A distinguishing feature of Zombies, Run! is its strong focus on a story, in which the user is a runner scavenging for supplies and taking on missions in a post-apocalyptic world.

2.1.1 Participants

We recruited convenience samples from Microworkers.com, an Internet-based service through which workers are paid to complete small tasks, such as filling out questionnaires. Participants earned US$3 for completing a survey. Using this type of service for sampling is considered appropriate (Buhrmester et al. 2011; Paolacci and Chandler 2014), and reliability has been found to be satisfactory among MTurk users (a competing service) (Buhrmester et al. 2011; Shapiro et al. 2013). Studies have also shown consistency of answers over time (Shapiro et al. 2013; Mason and Suri 2012; Rand 2012; Holden et al. 2013), which indicates the workers’ honesty. To ensure that respondents had at least a moderate experience of the service, we applied a screening question.

The surveys were answered by 187 respondents, 130 of whom completed the surveys in full (male: 58%; age: M = 27). Fifty-nine respondents completed the survey regarding Zombies, Run!, 31 regarding Duolingo, and 40 regarding Nike+ Run Club. The drop-out rates were similar in all groups. The surveys were only accessible to respondents from the USA (68%), the United Kingdom (16%), Canada (9%), Australia (6%), and New Zealand (1%) since the open-ended questions required extensive proficiency in English.
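The reported sample figures can be cross-checked with simple arithmetic. The sketch below uses only the numbers stated in the paragraph above; the variable names are ours. It confirms that the per-service counts sum to the 130 complete responses and derives the overall completion rate, which is not stated explicitly in the text.

```python
# All figures are taken from the participants paragraph; names are illustrative.
started = 187
completed_by_service = {
    "Zombies, Run!": 59,
    "Duolingo": 31,
    "Nike+ Run Club": 40,
}
country_share_pct = {
    "USA": 68,
    "United Kingdom": 16,
    "Canada": 9,
    "Australia": 6,
    "New Zealand": 1,
}

completed = sum(completed_by_service.values())  # should match the reported 130
completion_rate = completed / started           # share of starters who finished
dropout_rate = 1 - completion_rate

# Share of the complete responses contributed by each service.
service_share = {name: n / completed for name, n in completed_by_service.items()}

print(f"completed: {completed}")
print(f"completion rate: {completion_rate:.1%}, drop-out: {dropout_rate:.1%}")
for name, share in service_share.items():
    print(f"{name}: {share:.1%}")
```

The overall completion rate works out to roughly 69.5%. Because the number of respondents who started each individual survey is not reported, per-service drop-out rates cannot be derived from these figures alone.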

2.1.2 Materials

We used SurveyMonkey to design and distribute the surveys. Respondents were asked to describe their experiences in relation to different game elements found in Nike+ Run Club, Duolingo, and Zombies, Run! We selected these services since they, together, include all 10 categories of motivational affordances used when gamifying found in earlier research on gamification by Hamari et al. (2014). By including all these categories, we aimed to cover enough game design elements that afford sufficient scope of different experiences to create a model of the gameful experience that is generalizable to other gamified services. These categories include points, leaderboards, achievements/badges, levels, stories/themes, clear goals, feedback, rewards, progress, and challenges (Hamari et al. 2014).

Samples of these types of affordances were chosen from among the investigated services. For example, a template question was, “When thinking of the feature [motivational affordance], what are your experiences if you would look at [service] as a game?” A picture of the respective motivational affordance was presented with the question. In a pretest, a decline in the level of detail of responses was observed for later items in the questionnaires. Therefore, to receive an equal amount of information for each item and to avoid order-effect bias (Perreault 1975), the game elements were presented in random order.
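The randomization described above was handled by the survey tool; the following is only a minimal sketch of the underlying idea, assuming a separate shuffle is drawn for each respondent. The element labels, function names, and seed are hypothetical. The simulation illustrates why random ordering counters the pretest problem: across many respondents, every element ends up, on average, in the middle of the questionnaire rather than consistently late.

```python
import random

# Hypothetical element labels; the survey's actual items are not listed in the text.
GAME_ELEMENTS = ["points", "leaderboard", "badges", "levels", "story"]


def presentation_order(elements, respondent_id, seed="study1"):
    """Return a shuffled copy of `elements` for one respondent.

    Seeding with the respondent ID makes each order reproducible,
    while a different order is drawn for each respondent.
    """
    rng = random.Random(f"{seed}:{respondent_id}")
    order = list(elements)  # copy so the master list is never mutated
    rng.shuffle(order)
    return order


def mean_positions(elements, n_respondents):
    """Average 1-based position of each element across simulated respondents."""
    totals = {element: 0 for element in elements}
    for respondent_id in range(n_respondents):
        order = presentation_order(elements, respondent_id)
        for position, element in enumerate(order, start=1):
            totals[element] += position
    return {element: totals[element] / n_respondents for element in elements}


if __name__ == "__main__":
    print(presentation_order(GAME_ELEMENTS, respondent_id=1))
    print(mean_positions(GAME_ELEMENTS, 2000))
```

With five elements, each mean position should settle near 3, so any fatigue-related loss of detail in later answers is spread evenly over the elements instead of being concentrated on a few.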

2.1.3 Procedure

Data for the different services was collected on three different occasions. The study was published on Microworkers.com and the participants were informed about (a) the aim of the study, (b) the expected workload, (c) the need to have at least a moderate level of knowledge of the service, and (d) the $3 compensation for completing the survey. A link to the survey was provided and a screening question ensured a sufficient level of knowledge of the service. Attempts to complete the survey more than once on the same device were blocked. The participants received the $3 compensation through Microworkers.com after completing the survey.

2.2 Analysis and results

We used thematic analysis to analyze the survey responses, with previous research on game experience dimensions as a guide (see Sect. 1.2 and Table 1). However, since this study was about gameful experiences, we were open to finding dimensions that had not been discussed in previous games research; therefore, our approach was both deductive and inductive. The analysis was executed using NVivo 11. We followed the process described by Braun and Clarke (2006), which includes the following five steps. First, we read the material several times in order to become familiar with the data, to identify existing themes (among game experience dimensions) or new themes, and to generate ideas for coding. Second, we generated 47 nodes into which the data was coded. Third, we deductively arranged the nodes into themes when applicable; if a node did not fit within an existing theme, a new one was created. Fourth, we reviewed the themes and evaluated the extent to which the data extracts constituted a coherent theme and the extent to which the themes were reflected in the data extracts (see Table 2 for examples of responses by theme). Finally, as part of the fifth step of the thematic analysis and for the purpose of this article, we developed definitions for each dimension iteratively throughout the thematic analysis, informed both by the qualitative survey data and earlier digital games research (Sect. 1.2 and Table 1). These dimensions and definitions are presented in Table 3. The analysis was an iterative process and steps 2 to 5 were repeated multiple times. The analysis resulted in seven themes, which are presented below.

Table 2 Examples of responses by theme
Table 3 Dimensions derived from the analysis of the qualitative data and the review of the game experience of digital games

2.2.1 Accomplishment

Participants commonly described having a feeling of accomplishment, which was reported to be related to goals and to completed tasks created by the service. Taking something to completion, whether it was a task or a goal, seemed to be part of a drive to progress and a willingness to always improve. The accomplishments could be related to the service, but also to the real world, such as running more or being healthier.

2.2.2 Challenge

Participants reported that obstacles were both fun and motivating. Progressive skill-building was described as necessary to take on such obstacles. Challenges were related to the difficulty of a task; that is, the challenge originating from the task being difficult. As respondents progressed, obstacles were described as being increasingly difficult, which maintained the challenge. Participants reported that the challenges were induced by the users themselves and by the service. The challenges were described as a test of the user’s ability.

2.2.3 Competition

Participants commonly reported a feeling of competitiveness, based both on unspecified competitive aspects of the service and on there being winners among users. They described feeling a sense of pride related to others and some mentioned the term “bragging rights”. Some participants said competitiveness was motivating, but others found it demotivating, depending on whether they were competitive by nature. Participants mentioned having feelings of competitiveness towards different types of actors, including themselves, the service per se, and other people. In the latter case, the friends sub-group was mentioned often.

2.2.4 Guided

Some participants stated that they felt guided by the service, including being helped with (a) sticking to a plan; (b) structuring work, such as breaking tasks into smaller elements; and (c) getting feedback on their performance. Participants said this guidance could be at the task level (how to do better on a specific task) or at the general level (feedback on the users’ progress toward their goals).

2.2.5 Immersion

Participants described using the services as an immersive experience and, as an example, had emotional reactions to a story depicted by the service as if it occurred in the real world. Some participants also reported a change in their perceptions of the real world, such as time passing quickly or a targeted behavior becoming less effortful because the service acted as a distractor and grabbed the users’ attention. Some said they needed this diversion in order to cope with the target behavior.

2.2.6 Playfulness

Participants described using the service as pleasurable because they were able to create things, leaving room for imagination and creativity. Spontaneity was mentioned as an important aspect of games. Participants also mentioned explorative aspects, such as new venues opening up after achieving a certain level. One participant even used the word “mystery”. Some participants felt that the actions demanded by the app should be voluntary, and one participant said that compulsory actions would reduce the probability of completing the action.

2.2.7 Social experience

Participants said that the presence of other people was enough to invoke social experiences, such as feeling accountability when other people observe whether a goal is achieved. Some participants also reported having received support from others and being energized through friends’ encouragement. However, these social experiences did not always occur through the specific service, but could emanate from users’ participation in other services, such as Facebook, or from users being inspired to participate in activities with others in the physical world. The services also seemed to be able to create social experiences without the presence of real people.

2.3 Discussion

In Study 1, we set out to find and describe dimensions that constitute the gameful experience. Our main finding was a model that includes seven dimensions: accomplishment, challenge, competition, guided, immersion, playfulness, and social experiences.

Based on the review of instruments used to measure the game experience (Table 1), we conclude that immersion is one of the most commonly used constructs when describing the game experience (if closely related concepts such as flow, focused attention, and involvement are included). We also found evidence that there is an immersion dimension for the gameful experience. This finding is corroborated by Eppmann et al. (2018), who included the dimension of absorption (which is closely related to how we conceptualize immersion in Table 3) in GAMEX. Our model also includes accomplishment, which refers to a demand or drive to perform successfully, progress, and to achieve goals. Interestingly, immersion and accomplishment both seem to reflect a user’s engagement. However, while immersion is a short-term in-game effect, accomplishment also focuses on the engagement in the target behavior. Consequently, the experience of accomplishment will stretch beyond the game phase and into the postgame phase (see Elson et al. 2014); this reflects the thoughts of Bouvier et al. (2014), who stated that engagement might extend beyond the mediated activity. This type of accomplishment dimension can be found within games research in a few models (e.g., Yee 2006), and also as part of flow where clear goals are part of the construct (Jackson and Eklund 2004). However, this construct is missing in GAMEX. In fact, there does not seem to be a construct in GAMEX that reflects on this postgame-phase engagement in the target behavior, which differentiates this model from ours.

Despite the seemingly close relationship between play and games—games are played, after all—playfulness does not commonly occur among the game experience measures. However, there are dimensions of models that include facets of our conceptualization of playfulness; for example, in Sherry et al. (2006), fantasy is related to imagination in our conceptualization, and in Yee (2006), discovery is close to exploration. This is also the case with GAMEX, which incorporates creative thinking (includes items related to both imagination and exploration). Thus, playfulness, as conceptualized in our model, covers a broader spectrum of experiences compared to other instruments used for the game experience and in GAMEX.

Most game experience instruments, including GAMEX, do not include a social dimension. This lack of attention to the social aspects of gaming has been acknowledged (Gajadhar et al. 2008), and could be due to the fact that the social experience can be seen as a secondary aspect of gaming (e.g. Calvillo-Gámez et al. 2010). Furthermore, competition is a social experience and a competition dimension is not commonly found among the game experience measures (Yee (2006) is an exception) or included in GAMEX. Therefore, since our model includes the dimensions of social experience and competition, the social aspects of the gameful experience are comparatively important for our conceptualization.

Guided is the only dimension that, to the best of our knowledge, is not part of the models or measures of the game experience, or part of GAMEX. Feedback and goals are motivational affordances for gameful experiences (Hamari et al. 2014), which reasonably have the ability to offer guidance. While such affordances are part of normal games, such guidance might be less important or salient when playing games, possibly because the utilitarian focus of guidance is not congruent with the hedonic focus when playing such games. This could be a reason why this dimension is part of the gameful experience, but is missing as an experience related to playing games.

Finally, our model contains a challenge dimension. This dimension is commonly found in game experience measures, and games have even been defined as “the voluntary attempt to overcome unnecessary obstacles” (Suits 1978), which makes challenges intrinsic to games. As such, due to the inherent relationship between games and gamified services, it is unsurprising that challenge is a prevalent topic in gamification literature and studies (e.g. Hamari and Koivisto 2014; Hamari et al. 2014; Hildebrand et al. 2014). Consequently, it is equally unsurprising that challenge was found to be part of the gameful experience in the present study. This dimension is missing in GAMEX.

In sum, our model contains the unique dimension of “guided”. Furthermore, even though many of the dimensions have been used in game experience models and measures, our model contains a unique combination of such dimensions. Our model is also different from GAMEX, where four out of six dimensions are based on more general constructs (see “The gameful experience” section for discussion). In this study, we utilized a combined deductive and inductive approach, where the deductive part was informed by game experience research. We believe that this approach has resulted in a model that has the ability to describe the uniqueness of the gameful experience, while still honoring the knowledge from games research on the game experience.

3 Study 2: Developing the instrument

3.1 Method

The goal of our second, quantitative, study was to develop and test a tentative instrument for measuring the seven dimensions of the gameful experience identified in Study 1. We also aimed to evaluate psychometrics and, if necessary, develop the instrument in order to reach adequate psychometric properties during a subsequent third study. We continued to use the process for scale development described in DeVellis (2012) and tested the tentative instrument on users of the Zombies, Run! gamified service.

3.1.1 Measure development

An initial pool of items was generated for each of the seven predicted dimensions. This generation was guided by three sources: (1) definitions of the dimensions developed in Study 1 (Table 3); (2) the qualitative data on the dimensions and their underlying nodes from Study 1; and (3) scales and theory on the game experience used in digital games research. By using these sources, we aimed to generate items that honored existing knowledge from game experience research while also making GAMEFULQUEST sensitive to the specific nuances of the game aspects of gamified services. The definitions were particularly important. Content validity is heavily dependent on how well items reflect the measured construct’s definition (DeVellis 2012); therefore, a prerequisite for items to be included was that they measure the definition of a specific dimension rather than the dimension name. We also followed the recommendation of DeVellis (2012) and did not reverse-code items, because doing so could negatively impact their performance (DeVellis 2012; Harvey et al. 1985; Podsakoff et al. 2003).

This step resulted in 73 tentative items, which were reviewed by an expert panel of two psychology scholars and one gamification scholar. Subsequently, several items were dropped, rewritten, or added, resulting in an initial pool of 65 items. Using Fry’s readability graph (Fry 1977), we determined that the reading difficulty of these items was at a fifth-grade level, which is adequate (DeVellis 2012) for scales aimed at the general population.

3.1.2 Participants

We recruited a convenience sample of respondents from among followers of a Zombies, Run! Twitter account offered by Six to Start, the company that developed Zombies, Run! Respondents who completed the survey were entered into a prize draw for one of 25 Amazon gift cards worth US$10 each. We used a screening question to ensure that participants had at least some experience with Zombies, Run!

The survey was completed by 371 respondents (female: 60%; undisclosed gender: 2%; age: M = 38). People from 30 different countries participated, with the five most common countries of origin being United States (50%), United Kingdom (15%), Canada (8%), Germany (6%), and Australia (5%). Eighty-two percent of the respondents who started the survey finished it.

3.1.3 Materials

The survey was created and distributed using SurveyMonkey. A seven-point Likert-type scale was used, ranging from “strongly disagree” to “strongly agree.” SurveyMonkey was set to block multiple attempts to fill out the survey from the same device. To increase reliability and get a clearer factor structure, items were clustered according to their respective predicted dimension (Goldberg 1992). To avoid any systematic order effect (Perreault 1975), the dimensions were displayed in random order, and the items within each predicted dimension were also displayed in random order. The final tentative instrument is presented in “Appendix A”.
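The presentation scheme described above (items clustered by dimension, with both the dimension order and the within-dimension item order randomized) can be sketched as follows. This is a minimal illustration and not the SurveyMonkey configuration used in the study; the function name, the `seed` parameter, and the item labels are hypothetical.

```python
import random


def randomized_presentation(items_by_dimension, seed=None):
    """Return one respondent's presentation order.

    Items stay clustered by dimension; the order of the dimensions and the
    order of the items inside each dimension are shuffled independently.
    """
    rng = random.Random(seed)  # seed only makes the illustration reproducible
    dimensions = list(items_by_dimension)
    rng.shuffle(dimensions)
    order = []
    for dimension in dimensions:
        items = list(items_by_dimension[dimension])  # copy, keep input intact
        rng.shuffle(items)
        order.append((dimension, items))
    return order


# Hypothetical item labels; the actual items are listed in the appendix.
example = {
    "accomplishment": ["ACC1", "ACC2", "ACC3"],
    "challenge": ["CHA1", "CHA2", "CHA3"],
    "immersion": ["IMM1", "IMM2", "IMM3"],
}
presentation = randomized_presentation(example, seed=1)
```

Each call yields a list of `(dimension, items)` pairs, so a block of items never mixes dimensions even though no two respondents need to see the same order.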

3.1.4 Procedure

Followers of the Zombies, Run! Twitter account were informed via a tweet about the study and the prize draw. To participate, respondents followed a link to an online survey. The prize draw was conducted after the data collection had ended.

3.2 Results

Descriptive statistics (Table 4) for the seven predicted dimensions showed that their mean values gravitated toward higher values. Nonetheless, both skewness and kurtosis indicated that the data was normally distributed. Cronbach’s alpha was > 0.7 for all predicted dimensions, which indicates reliability (Nunnally and Bernstein 1994).
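For readers who wish to reproduce the reliability check, Cronbach’s alpha for a single dimension can be computed as k/(k − 1) · (1 − Σ item variances / variance of the summed score), where k is the number of items. The sketch below assumes complete responses; the function name and the toy data are illustrative and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for one scale.

    `responses` is a list of respondents, each a list of item scores
    (same number of items for everyone, no missing values).
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of the summed scale score.
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Made-up seven-point responses from five respondents to three items.
toy = [
    [6, 7, 6],
    [5, 5, 6],
    [3, 4, 3],
    [7, 6, 7],
    [4, 4, 5],
]
alpha = cronbach_alpha(toy)
reliable = alpha > 0.7  # the cut-off cited in the text
```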

Table 4 Mean, standard deviation, Cronbach’s alpha and distribution of the seven predicted dimensions

We tested dimensionality using principal components analysis. The data were deemed suitable for this purpose since (a) the correlation matrix showed coefficients above .3 between most items with their respective predicted dimension; (b) Bartlett’s test of sphericity was significant (χ2(2080) = 16,600.30; p < .001); and (c) the Kaiser–Meyer–Olkin measure of sampling adequacy (.95) was above the cut-off value 0.6 (Tabachnick and Fidell 2013).
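As a point of reference, the degrees of freedom of Bartlett’s test equal p(p − 1)/2 for p items, which gives 2080 for the 65 items in the initial pool, in line with the value reported above. The statistic itself is commonly written as −(n − 1 − (2p + 5)/6) · ln|R|, where R is the item correlation matrix and n the number of respondents. The following sketch implements both under these textbook definitions; the function names and the small correlation matrix are illustrative, and the p value (which requires the chi-square distribution) is deliberately left out.

```python
import math


def bartlett_df(n_items):
    """Degrees of freedom of Bartlett's test: p(p - 1) / 2."""
    return n_items * (n_items - 1) // 2


def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0  # singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def bartlett_statistic(corr, n_respondents):
    """Chi-square statistic: -(n - 1 - (2p + 5) / 6) * ln|R|.

    Raises ValueError if the correlation matrix is singular.
    """
    p = len(corr)
    return -(n_respondents - 1 - (2 * p + 5) / 6) * math.log(determinant(corr))


# Illustrative 3 x 3 correlation matrix, not data from the study.
toy_corr = [
    [1.0, 0.5, 0.4],
    [0.5, 1.0, 0.3],
    [0.4, 0.3, 1.0],
]
toy_statistic = bartlett_statistic(toy_corr, n_respondents=100)
```

An identity correlation matrix (no shared variance at all) yields a statistic of zero, which is the null hypothesis the test is designed to reject.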

Nine eigenvalues above one were revealed. Factors based on these nine eigenvalues explained 64.2% of the variance. The predicted dimensions confirmed as factors during the analysis were accomplishment, challenge, competition, guided, immersion, and social experience; however, the predicted dimension of playfulness was split into two factors. There was also a new ninth factor whose items all cross-loaded (all items loaded at least close to 0.4 on another factor) (“Appendix A”). However, when using the criterion value obtained from Parallel Analysis (Horn 1965), using the software Monte Carlo PCA for Parallel Analysis (Watkins 2006), only six factors emerged. In this case, the factors accomplishment, immersion, social experience, competition, guided, and one of the two factors from the predicted playfulness dimension were above the criterion eigenvalue. These six factors explained 57.6% of the variance.
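The two retention rules compared above differ only in the threshold applied to each eigenvalue: a fixed value of one versus the eigenvalue obtained at the same position from random data of the same dimensions. The sketch below shows the decision step only and assumes that both sets of eigenvalues have already been computed; the function names are hypothetical and the numbers are invented to mirror the nine-versus-six pattern, not taken from the study.

```python
def kaiser_count(eigenvalues, cutoff=1.0):
    """Number of components with an eigenvalue above the cut-off."""
    return sum(1 for value in eigenvalues if value > cutoff)


def parallel_analysis_count(observed, random_reference):
    """Retain components, in descending order, while the observed eigenvalue
    exceeds the eigenvalue obtained from random data of the same size."""
    retained = 0
    pairs = zip(sorted(observed, reverse=True),
                sorted(random_reference, reverse=True))
    for obs, ref in pairs:
        if obs <= ref:
            break  # stop at the first component that fails the criterion
        retained += 1
    return retained


# Illustrative values only, NOT the eigenvalues from the study.
observed = [9.8, 4.1, 3.0, 2.2, 1.9, 1.6, 1.3, 1.15, 1.05, 0.9]
random_reference = [1.8, 1.7, 1.62, 1.55, 1.5, 1.45, 1.4, 1.35, 1.3, 1.25]

n_kaiser = kaiser_count(observed)
n_parallel = parallel_analysis_count(observed, random_reference)
```

Because the random-data eigenvalues for the first components are well above one, parallel analysis is the stricter rule and typically retains fewer factors.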

3.3 Discussion

In this second study, we sought to develop and test a tentative instrument for measuring the seven dimensions of the gameful experience found in Study 1. We also sought bases for improvements to reach adequate psychometric properties. Accordingly, our findings in Study 2 verified the dimensionality of the factors accomplishment, competition, guided, immersion, and social experience, and we are able to present the suggestions below to improve psychometric properties.

The results regarding the predicted dimension challenge were contradictory. When using an eigenvalue of one as the cut-off level during principal components analysis, we confirmed its dimensionality. This was not the case when using parallel analysis. However, since one of these methods was supportive, we retained this dimension, allowing the results of Study 3 to guide the final decision after addressing poorly performing items.

Playfulness caused several problems. Most notably, the predicted dimension split into two factors, one of which had three items that cross-loaded on the immersion factor. In addition, when using parallel analysis, only one of these factors reached the cut-off eigenvalue. Furthermore, the mean value and standard deviation of the predicted playfulness dimension did show signs of a roof effect. Since playfulness was found to be a dimension in Study 1, we retained it for theoretical reasons. However, we considered removing the cross-loading and low-loading items during Study 3. In addition, the split of the predicted playfulness dimension into two factors might have been caused by the roof effect, which reduces variance. We addressed this roof effect during the third and final study.

An additional ninth factor emerged when using eigenvalue 1 as the cut-off level. Since all of this factor’s items cross-loaded on other factors, it did not stand on its own. In addition, it was not found to be a dimension in Study 1, so there were no theoretical reasons for keeping it. Consequently, we aimed to remove it by excluding problematic items during Study 3.

We can safely assume that followers of a Zombies, Run! Twitter account have a more positive attitude towards the service than other users. In fact, the means of all dimensions were above four (Table 4). Since four is the midpoint of the scales used, the means should preferably be centered on four. The above-mentioned roof effect on the playfulness dimension is one example of how this sampling method might have affected the study. Therefore, in the third study it was imperative to sample participants who had a more varied attitude toward the investigated service.

4 Study 3: Confirming the instrument

4.1 Method

The aim of the final study, which was quantitative in nature, was to reach satisfactory psychometric properties by developing the instrument using the results of Study 2 as input. Continuing to rely on the process described by DeVellis (2012) for scale development, we improved the instrument and then tested it on users of the gamified service Duolingo.

4.1.1 Measure development

As described in the discussion of Study 2, we developed the measure to (a) weed out the ninth factor, (b) improve the factorial properties of the challenge dimension, and (c) improve the playfulness dimension and evaluate whether it indeed divided into two factors. We also took two more general actions: we eliminated ill-working items of the full instrument to improve its psychometric properties, and we reduced the number of items, albeit on a limited basis because we prioritized the explanatory richness of the instrument.

In this way, eight items were removed. We utilized a general cut-off level of 0.4 for factor loadings, such that if an item loaded less than 0.4 on a factor, it was subject to removal. In addition, when an item cross-loaded more than 0.4 on two factors, it was subject to removal. Thus, in effect, we inverted the usage of our 0.4 cut-off level to handle cross-loading. Finally, we removed some items simply to decrease the number of items. In these cases, our rationale was low loadings compared with other items within the dimension or concerns regarding the construction of the item. Item-specific rationales for the removals are presented in Table 5.
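The two loading-based rules can be expressed as a simple screening function. The sketch below is a minimal reading of the rules as stated above (a loading below 0.4 everywhere, or loadings above 0.4 on two or more factors); the function name, the labels it returns, and the example loadings are hypothetical, and the judgment-based removals listed in Table 5 are not covered.

```python
def screen_item(loadings, cutoff=0.4):
    """Classify one item from its loadings on all factors.

    Returns "low" if no loading reaches the cut-off, "cross" if two or more
    loadings exceed it, and "keep" otherwise.
    """
    magnitudes = [abs(value) for value in loadings.values()]
    if max(magnitudes) < cutoff:
        return "low"
    if sum(1 for value in magnitudes if value > cutoff) >= 2:
        return "cross"
    return "keep"


# Hypothetical loadings for three items; not taken from the study.
items = {
    "item_a": {"challenge": 0.72, "immersion": 0.12},
    "item_b": {"challenge": 0.35, "immersion": 0.28},
    "item_c": {"playfulness": 0.55, "immersion": 0.47},
}
flags = {name: screen_item(loadings) for name, loadings in items.items()}
```

Items flagged "low" or "cross" are candidates for removal rather than automatic deletions, which matches the wording "subject to removal".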

Table 5 Cause for removal of items

4.1.2 Participants

We used a convenience sample. Because the sampling method used in Study 2 seemed to generate an overly positive attitude toward the investigated service, we recruited participants from sources with a more varied focus for this study, such as Internet forums focusing on Duolingo and those focusing on language learning in general. We expected the latter to include Duolingo users (and former users) who had more varied attitudes towards Duolingo. Participants who completed the survey were entered into a draw for one of 25 Amazon gift cards worth US$10. We used a screening question to verify that users had experience with Duolingo. The sample consisted of 507 respondents (male: 61%; did not disclose gender: 4%; age: M = 38) from 52 countries. The most common countries of respondents were the United States (44%), the United Kingdom (10%), Canada (5%), Australia (4%), and Germany (3%). The completion rate among participants who started the survey was 52%.

4.1.3 Materials

We used SurveyMonkey to create and distribute the survey and included a seven-point Likert-type scale that ranged from “strongly disagree” to “strongly agree.” Participants were blocked from completing the survey multiple times from the same device. The items were clustered according to dimension to improve reliability and to get a clearer factor structure (Goldberg 1992). In addition, both the dimensions and the items within the dimensions were displayed in random order to avoid order-effect bias (Perreault 1975). The final instrument can be found in “Appendix A”.

4.1.4 Procedure

The survey was published on numerous Internet forums that had either an explicit focus on Duolingo or a focus on general language learning. Participants were informed about the study, including the prize draw, in the forum post. The respondents who chose to participate followed a link to the online survey. The prize draw was conducted after the data collection was complete.

4.2 Results

The descriptive data (Table 6) demonstrated that the roof effects that were present in Study 2 were mitigated in this study. Instead, the mean values of the dimensions were centered on the midpoint four, which indicates a less uniformly positive attitude towards the service compared with Study 2. For all predicted dimensions, the Cronbach’s alpha was well above the cut-off level of 0.7 (Nunnally and Bernstein 1994), and the data were normally distributed in all predicted dimensions, except for “accomplishment,” which had a slightly (but not problematic) high skewness and kurtosis.
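Skewness and kurtosis for a dimension score can be checked with a few lines of code. The sketch below uses the simple moment-based definitions (statistical packages often report bias-corrected variants that differ slightly), and because the text does not state which limits were applied, the default limits in `roughly_normal` are an assumption made purely for illustration.

```python
def skewness_and_kurtosis(scores):
    """Moment-based skewness (g1) and excess kurtosis (g2)."""
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n
    m3 = sum((x - mean) ** 3 for x in scores) / n
    m4 = sum((x - mean) ** 4 for x in scores) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


def roughly_normal(scores, max_abs_skew=2.0, max_abs_kurtosis=7.0):
    """Apply limits to both statistics; the defaults are assumed values,
    not limits reported in the study."""
    skew, kurt = skewness_and_kurtosis(scores)
    return abs(skew) < max_abs_skew and abs(kurt) < max_abs_kurtosis


# Made-up, perfectly symmetric scores.
symmetric = [1, 2, 3, 4, 5]
skew, kurt = skewness_and_kurtosis(symmetric)
```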

Table 6 Mean, standard deviation, Cronbach’s alpha and distribution of dimensions

We repeated the principal components analysis in order to investigate the inconclusive dimensionality for some factors encountered in Study 2. The data were adequate for factor analysis because (a) the correlation matrix showed correlations above .3 for all items and their respective predicted dimension, (b) the Bartlett’s test of sphericity was significant (χ2(1540) = 22,274.80, p < .001), and (c) the Kaiser–Meyer–Olkin measure of sampling adequacy (.967) was above .6 (Tabachnick and Fidell 2013).

The principal components analysis revealed seven factors (Table 6) both when using an eigenvalue of 1 as the cut-off level and when using the eigenvalue obtained from parallel analysis (Horn 1965), calculated with the software Monte Carlo PCA for Parallel Analysis (Watkins 2006). These seven factors explained 67.3% of the variance. No item loaded less than 0.4 on its factor, and no item cross-loaded such that it loaded more than 0.4 on two factors. Therefore, the dimensionality of all predicted dimensions was confirmed and the problems emerging in Study 2 were mitigated.

Because the dimensionality was confirmed for each of the seven predicted dimensions without the need for alteration, we were able to perform a confirmatory factor analysis using a fully a priori specified model. We conducted this analysis using (a) maximum likelihood estimation, (b) measurement errors that were presumed uncorrelated, and (c) factors that were left free to correlate (Fig. 1).

Fig. 1 The complete a priori specified model evaluated in the confirmatory factor analysis (items can be found in “Appendix A”)

The analysis showed that all factor loadings were statistically significant. All factors showed convergent validity using AVE ≥ 0.5 as the cut-off value (Bagozzi and Yi 1988). All factors showed discriminant validity using the Fornell–Larcker criterion (Fornell and Larcker 1981), although accomplishment was close to non-discriminant from challenge, playfulness, and guided. In addition, both accomplishment and playfulness were quite strongly correlated with several other factors (Table 7).
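Both validity checks follow from the standardized loadings and the factor correlations. AVE is the mean of a factor’s squared standardized loadings, and the Fornell–Larcker criterion requires the square root of each factor’s AVE to exceed that factor’s correlations with all other factors. The sketch below assumes standardized solutions; the function names and all numbers are illustrative and do not reproduce Table 7.

```python
import math


def average_variance_extracted(std_loadings):
    """AVE: mean of the squared standardized loadings of one factor."""
    return sum(value ** 2 for value in std_loadings) / len(std_loadings)


def fornell_larcker(ave_by_factor, correlations):
    """For each factor, check that sqrt(AVE) exceeds its absolute correlation
    with every other factor.

    `correlations` maps frozenset({factor_a, factor_b}) to their correlation.
    """
    result = {}
    for factor, ave in ave_by_factor.items():
        others = [abs(r) for pair, r in correlations.items() if factor in pair]
        result[factor] = all(math.sqrt(ave) > r for r in others)
    return result


# Hypothetical values for two factors; not results from the study.
ave = {
    "factor_a": 0.64,  # sqrt(AVE) = 0.80
    "factor_b": 0.49,  # sqrt(AVE) = 0.70
}
corr = {frozenset({"factor_a", "factor_b"}): 0.75}

convergent = {name: value >= 0.5 for name, value in ave.items()}
discriminant = fornell_larcker(ave, corr)
```

In this toy case the second factor fails both checks, while the first passes the discriminant check by only a small margin, which is the kind of "close to non-discriminant" situation described above.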

Table 7 Convergent validity (AVE) and discriminant validity (Fornell–Larcker criterion)

Regarding model fit, the chi-square test was significant, which could indicate bad fit (χ2 = 3019.984, df = 1463, p < .001); however, this result could be expected because of the sample size and the complexity of the tested model (Hair et al. 2010). Following the suggestion of Brown (2006), we reported CFI, TLI, SRMR, and RMSEA to cover various information regarding model fit. Both CFI (.928) and TLI (.924) were above .9, which indicates adequate fit, considering the present sample size and the number of observed variables (Hair et al. 2010). RMSEA (.046 [90% CI .044–.048, CFit = .998]) was below .06 and SRMR (.0561) was below .08, which indicates good fit (Hu and Bentler 1999). All in all, we can conclude that the data fit our model well.
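The RMSEA point estimate can be recovered from the reported chi-square with the usual formula, the square root of max(χ2 − df, 0) / (df · (N − 1)). The sketch below assumes that all 507 respondents entered the analysis, which is not stated explicitly in the text but yields a value that rounds to the reported .046; the function names are hypothetical, and the cut-offs are those named above.

```python
import math


def rmsea(chi_square, df, n):
    """Point estimate of RMSEA from the model chi-square.

    Uses the (n - 1) convention; some programs divide by n instead.
    """
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


def meets_cutoffs(cfi, tli, rmsea_value, srmr):
    """Apply the cut-offs named in the text to each index."""
    return {
        "CFI": cfi > 0.90,
        "TLI": tli > 0.90,
        "RMSEA": rmsea_value < 0.06,
        "SRMR": srmr < 0.08,
    }


# Values reported in the text; N = 507 is an assumption (see lead-in).
estimate = rmsea(3019.984, 1463, 507)
checks = meets_cutoffs(cfi=0.928, tli=0.924, rmsea_value=estimate, srmr=0.0561)
```

The confidence interval and the close-fit test (CFit) require the noncentral chi-square distribution and are therefore not reproduced here.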

4.3 Discussion

In Study 3, we aimed to improve the instrument using the results of Study 2 as input. As a result of these improvements, we confirmed a psychometrically sound instrument, as presented in “Appendix A”. The most pronounced improvements were that (a) the ninth factor was weeded out, (b) the dimensionality of the factor challenge was improved, and (c) the predicted playfulness dimension emerged as one factor.

However, some issues remained. The factor accomplishment was close to non-discriminant from playfulness, challenge, and guided. In addition, both accomplishment and playfulness were quite highly correlated with several other factors. Therefore, there are indications of a possible internal structure among the dimensions, which may need to be examined further in future research.

One could also argue that the change in the service we chose to investigate in Study 3 (Duolingo) may have improved the dimensionality rather than alterations in the instrument. Therefore, we may not be able to generalize the results to Zombies, Run! or for that matter to other services. However, for the dimensions showing adequate dimensionality in both studies, the results in Study 3 indicate such generalizability, notwithstanding other issues discussed below.

5 General discussion

This research makes two main contributions. The first is a validated instrument with adequate psychometric properties that can be used to model and measure the individual user’s gameful experience when using a service. Second, this study develops the understanding of the gameful experience by identifying seven dimensions that collectively describe this experience. These dimensions are accomplishment, challenge, competition, guided, immersion, playfulness, and social experience.

A common approach to scale development is to use exploratory factor analysis to find latent variables from a set of items. However, factor analysis can identify these variables based on the design of items rather than the existence of a construct (see, e.g., DeVellis 2012). As such, a model that appears to define a construct might arise from poor craftsmanship and fail to be a valid description of the world. To avoid this problem, we chose to develop GAMEFULQUEST using a mixed-methods approach, beginning with a qualitative study in which the gameful experience was described through its sub-dimensions. Using a combined inductive and deductive (based on game experience research) approach, we were open to unique aspects of using gamified services, while still honoring the research that had already been conducted on game experiences within games research. In two subsequent quantitative studies, we sought to find a one-dimensional scale for each of these dimensions. Thus, the qualitative analysis rather than the design of specific items guided the construction of the model and, therefore, the instrument. This approach, in addition to application of the extensive process for scale development suggested by DeVellis (2012), resulted in an instrument with both validity and reliability.

While all of the dimensions included in GAMEFULQUEST except guided are present in existing measures and models of the game experience, the combination constituting our model is different from existing game experience models. Several aspects contribute to this difference. Some of the discovered dimensions have clear utilitarian properties (see Davis 1989). Guided can be categorized as such, and the social experience is also partly utilitarian, stemming from, for example, the usefulness of belonging to a community (Hamari and Koivisto 2013); it is only partly utilitarian because social relationships are also enjoyable (e.g., Ryan and Deci 2000). Gamified services have both utilitarian and hedonic aspects. These utilitarian aspects stem from the implemented utilitarian functions in a system or service (Hamari and Koivisto 2015). However, the present study reflects the fact that the affordances for gameful experience can also afford such utilitarian experiences.

Some scholars have defined gamification as pertaining to non-game contexts (Seaborn and Fels 2015; Deterding et al. 2011). Therefore, even though this strict view has been criticized (see Huotari and Hamari 2017 for discussion), the usage of gamified services might pertain to contexts beyond the usage of the service per se. One example is the social aspects of gamified services. Within games research, the social experience is often missing when conceptualizing the game experience (Gajadhar et al. 2008), arguably because it may be considered secondary (e.g., Calvillo-Gámez et al. 2010). In our qualitative study, we found a strong prevalence of social aspects. However, these aspects stemmed from various contexts, including contexts beyond the service per se (the service, social media, and the physical world). This strong prevalence meant that such experiences could not be considered secondary. In fact, social experience and competition were both included in our model, which made social aspects an important part of our conceptualization of the gameful experience.

The goal of gamifying is to intrinsically motivate a behavior (Hamari et al. 2014; Mora et al. 2015; Huotari and Hamari 2017; Rigby 2015; Seaborn and Fels 2015), a behavior that often does not occur while using the service per se. This means that the motivational effect of the gameful experience must extend beyond the in-game phase into the postgame phase (see Elson et al. 2014 regarding game phases). We refer to accomplishment as a demand or drive for successful performance, goal achievement, and progress (Table 3). These demands or drives reflect an engagement with both the service and the target behavior. Therefore, adding such accomplishment when conceptualizing the gameful experience reflects the existence of a target behavior and, consequently, experiences that extend into the postgame phase. Considering the goal of gamification, we believe that this type of dimension of the gameful experience is essential.

A final difference is that gamified services lack the possibilities that games have to create immersive sensory experiences (Hamari and Koivisto 2014). Since presence occurs when a user does not acknowledge the medium (Lombard and Ditton 1997), it is unsurprising that this type of experience did not emerge within the qualitative data.

In sum, these are unique aspects of the gameful experience that are reflected in our model. Consequently, we believe that GAMEFULQUEST is a better way of describing the gameful experience than reusing existing game experience measures. While many of these aspects are also missing in GAMEX, there is one aspect reflected both in GAMEX and in the game experience literature that is missing in GAMEFULQUEST: the experience of playing games can be negative. For example, GAMEX includes the dimension absence of negative affect, Jennett et al. (2008) discussed negative affect and anxiety, and Ijsselsteijn et al. (2008) included the dimensions negative affect and tension/annoyance in their Game Experience Questionnaire. Because our model does not include such negative aspects, it paints a relatively positive picture of the gameful experience. However, these negative aspects are described as emotional responses both in GAMEX and within games research. In our view, such emotional responses are an experience on a different and more general level (e.g., Lazarus 1991; Cosmides and Tooby 2000) than the level that we believe needs to be in focus to describe the unique aspects of the gameful experience. Therefore, due to these general qualities, we believe these negative aspects are better described as outcomes of the gameful experience. In fact, we consider them to be the outcome of specific dimensions of the gameful experience; for example, immersion is sometimes associated with negative emotions and anxiety (Jennett et al. 2008), and challenges might be associated with anxiety (Csikszentmihalyi 1975). Consequently, we do not consider any of the dimensions of GAMEFULQUEST to be inherently negative (or positive for that matter), although the outcome of the corresponding experiences might be.

Personalization has developed into a fast-growing area within gamification research (Böckle et al. 2017), and GAMEFULQUEST opens the way to explore how gameful experiences can be used for user modeling and user-adapted interaction. However, contrary to, for example, Orji et al. (2017) and Orji et al. (2014), GAMEFULQUEST focuses on a state and not a trait. Therefore, while our measurement model is not a “user model” per se, it is a model of the individual experience of gamefulness. This means that it can be used directly to tailor services and their interfaces to the individual; for example, users of different skill levels will find a given difficulty level more or less challenging, so the service can adapt by lowering the difficulty level; alternatively, one user might find the service less competitive than other users do, so a competitive game design element, such as a leaderboard, might be added. This enables service providers to aim to create a specific gameful experience or a specific level of this experience. This state perspective also has the potential to make services adapt to changes in the gameful experience over time (for example, when the experience of being challenged declines), a type of continuous adaptation to the progress and skill of a user previously suggested both for games (Georgiou and Demiris 2017) and for gamified services (Afyouni et al. 2017; Hocine et al. 2015). However, doing this with GAMEFULQUEST comes at the cost of continuous measurement for the user.
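As a minimal sketch of the state-based tailoring described above, the following Python example maps measured dimension scores to service adjustments. It is an illustration under stated assumptions and not part of the validated instrument: the 7-point response scale, the thresholds `high` and `low`, the `ServiceConfig` fields, and the function names are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ServiceConfig:
    # Hypothetical service settings, invented for this illustration.
    difficulty: int = 3        # 1 (easiest) to 5 (hardest)
    leaderboard: bool = False  # whether a leaderboard is shown


def dimension_score(item_responses):
    """Mean of one user's item responses for a single dimension.

    Assumes items are answered on a 7-point scale (an assumption).
    """
    return mean(item_responses)


def adapt(config, scores, high=5.5, low=3.0):
    """Return a new config adjusted to the user's measured state.

    `scores` maps dimension names to mean scores. The thresholds
    `high` and `low` are arbitrary placeholders, not empirical values.
    """
    difficulty = config.difficulty
    leaderboard = config.leaderboard
    # The user experiences the service as very challenging: lower difficulty.
    if scores["challenge"] > high and difficulty > 1:
        difficulty -= 1
    # The user experiences little competition: add a competitive element.
    if scores["competition"] < low:
        leaderboard = True
    return ServiceConfig(difficulty, leaderboard)


scores = {
    "challenge": dimension_score([6, 7, 6]),
    "competition": dimension_score([2, 3, 2]),
}
print(adapt(ServiceConfig(), scores))
```

Because the scores describe a state rather than a trait, such a function would have to be re-run on fresh responses each time the service adapts, which is the measurement cost noted above.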

GAMEFULQUEST could also be used to inform user-modeling research. As we have pointed out throughout this paper, the behavior targeted for change when gamifying is driven by the gameful experience that is created (Huotari and Hamari 2017; Seaborn and Fels 2015; Werbach 2014; Landers et al. 2018). This means that the evaluation of user adaptation for gamified services depends on the successful measurement of the individual’s level of this experience. It has been found that game-based persuasive strategies focusing on social aspects like competition, comparison, and cooperation can have both positive and negative effects depending on the user’s personality type (Orji et al. 2017) or gamer type (Orji et al. 2014). These are examples of where GAMEFULQUEST could be used to further explore concepts that are closely related to parts of our model. In addition to these dimensions, GAMEFULQUEST opens the way for such exploration of the full range of experiences afforded by gamified services.

Finally, it is important to point out that GAMEFULQUEST is not limited to user modeling and personalization; it is also an important tool for understanding gamification, whether intentional or unintentional, in all contexts and, as such, for understanding the cultural shift towards more gameful experiences in people’s day-to-day lives.

5.1 Limitations and future research

This study was based on Internet surveys, which come with both general and unique challenges (Vehovar and Manfreda 2017). The recruitment method for the studies excluded participants who were not users of Twitter, forums, or Microworkers, and no adequate directory of users was available, which leaves the sampling method non-probabilistic. In addition, recruitment was done using a general invitation, which accentuates the non-probabilistic nature of the studies. As such, self-selection of the participants makes it impossible to assess nonresponse problems and creates a sample that might not be representative of the population of interest (Fricker 2017). To generate truly generalizable results, it is necessary to have a study that uses a probabilistic sampling method, a quality sampling frame, and adequate follow-ups to improve response rates. Such a study, in cooperation with a service developer willing to contribute an appropriate sampling frame consisting of both users and former users, would be preferable.

The game aspects of a service can be more or less obvious to a user, particularly when considering that services can be both intentionally and unintentionally gamified. When using Zombies, Run!, it seems probable that users see the app as a game; however, they will probably not have the same reaction to gamification implemented by, for instance, adding points and levels for purchasing coffee. In the latter case, an item like “gives me a sense of being separated from the real world” (part of the GAMEFULQUEST measure) might seem both strange and out of context. A similar reaction may result if the instrument is used for an experiment in which the control condition includes a non-gamified solution. Our study did not test the instrument on services that are not gamified or are unintentionally gamified. In addition, the instrument has only been validated on two intentionally gamified services, both of which were part of the qualitative study. Even though we aimed at a model that generalizes to other services by covering all types of affordances for gameful experiences found in earlier research by Hamari et al. (2014), more studies are needed to establish whether the instrument is truly generalizable. Therefore, verifying the generalizability on other services, including intentionally gamified, unintentionally gamified, and non-gamified services, will be a valuable contribution to the development of this instrument.

Discriminant and convergent validity have been established among the dimensions of GAMEFULQUEST. However, neither discriminant nor convergent validity with a construct beyond the dimensions of GAMEFULQUEST has been established. This is an important step for future validation of this instrument.

Even though we have described how GAMEFULQUEST can be used for user modeling and user-adapted interaction, it has not been validated within this context, so further research within this area is needed. One specific research question would be which affordances for gameful experiences affect which dimensions of the gameful experience. This would preferably result in an extensive mapping of affordances for gameful experience to the seven dimensions of the gameful experience, which could be used for user adaptation. However, it is important to mention that the gameful experience is not necessarily created by what are traditionally considered game elements, nor does it necessarily increase with the number of such elements alone. Its creation is a more complex issue, and GAMEFULQUEST will be an important tool for targeting the important empirical research problem of how gamification affects the gameful experience.