1 Introduction

A game can be regarded as a service that provides players with different experiences. As the game industry continues to develop, it draws on a growing body of interdisciplinary knowledge and theory. As shown in Fig. 1, game analytics derives from Business Intelligence (BI) and reflects the combination of BI with game research [1].

Fig. 1 Game analytics research

Davenport and Harris [2] define BI as the collection, management, and reporting of decision-oriented data, together with the analytical technologies and computing approaches applied to that data. Analytics is used for querying and reporting on BI data and for advanced analysis, including statistical analysis, predictive modeling, optimization, and forecasting [2]. The purpose of analytics is to solve problems, make business predictions, support decision making, promote optimization, and improve business performance. Game analytics is the process of identifying and communicating meaningful patterns that can be used for decision making in games [1].

The purpose of this paper is to present a comprehensive literature review of the application of BI in the game industry, especially game analytics, and to provide a reasonable classification of those application areas. The rest of the paper is structured as follows: Sect. 2 provides the foundations of game analytics; Sect. 3 describes the methodology used for the literature review; Sect. 4 presents the detailed analysis process, including the identification of dimensions and codes; Sect. 5 discusses the results obtained from this literature review; Sect. 6 identifies the problems and challenges in this literature review; Sect. 7 summarizes the answers to our research questions; and Sect. 8 concludes the literature review and discusses its contribution to future research.

2 Foundation

As shown in Fig. 2, analytics has several branches, such as marketing analytics, risk analytics, Web analytics, and game analytics, based on the previous classification [1]. Game analytics has already been used in the game industry for many years. Kim et al. [3] discuss how game analytics can be used to identify in-game balancing issues. Hullett et al. [4] use it to reduce game development costs and avoid development risks. Moura et al. [5] apply it to visualize players’ movement paths on the map and identify the points where players get blocked. Zoeller [6] provides a solution for detecting in-game bugs with game analytics. However, most of this research focuses only on game development and game research [1]. As for mobile game analytics, Drachen et al. [7] point out that this field of research is in its infancy and that the available knowledge is heavily fragmented.

Fig. 2 Relationship between BI and game analytics

The latest literature review is provided by Alonso-Fernández et al. [8]. They systematically summarize the research progress in applying data science and technology to serious game analytics to support decision making. First, they discuss the purposes of applying data science to serious games. Second, they list the data science algorithms and analysis techniques commonly used for learning analytics. Third, they discuss the research scope, which mainly covers education. Finally, they draw results and conclusions, especially on why and how to apply data science and technology to serious game analytics. In addition, Fernandes et al. [9] provide a survey of game analytics in massively multiplayer online games. They summarize the techniques used in the analysis of massively multiplayer online games published over the past 14 years. Their purpose is to outline the latest research areas, and they select 31 papers from the IEEE and ACM digital libraries for analysis.

The previous literature reviews have three limitations. First, they mainly focus on specific games and do not cover the application of analytics across the whole value chain of the game industry. Second, they summarize papers rather than classify them. Finally, they lack further discussion of current research problems and give no specific suggestions for game analytics research. These limitations motivate our research, in particular the use of the game industry value chain to build a reasonable classification and then present the research status and trends.

Although game analytics is currently widely used in the game field, it lacks a reasonable classification. As the previous literature reviews show, there is no systematic research on, or reasonable classification of, BI used in the game industry, especially game analytics. According to the traditional game value chain shown in Fig. 3, the game industry starts with game developers, who are responsible for developing the games. When a game is ready, the developers find a game publisher to help publish it. The publisher publishes the game through a game distributor and connects it with potential players. As the traditional game value chain covers the whole game industry [10], it is reasonable to use it as the basis for classifying game analytics.

Fig. 3 Traditional game value chain

Following the traditional game value chain, we can make a preliminary classification of game analytics. It should include at least four parts: game development analytics, game publishing analytics, game distribution channel analytics, and game player analytics, as shown in Fig. 4.

Fig. 4 Game analytics classification based on value chain

Game player analytics is vital for game analytics. Its core is to analyze players’ in-game behavior and specific preferences and to guide game development in the right direction. This kind of analytics is based on player segmentation, including the motivation for playing games and the player’s game experience. Game development analytics includes verification of the gameplay, interface analytics, system analytics, process analytics, and performance analytics [1]. Game publishing analytics mainly focuses on player acquisition, retention, and revenue analytics. Channel analytics primarily focuses on analyzing the attributes of the distribution channel and provides specific solutions for game promotion.

Game metrics can be defined as the behavioral data source used for game analytics. El-Nasr et al. [1] present the advantages of metrics as a further source of BI that can be used for decision making in the game industry. Metrics can be variables, features, or calculated values. Game metrics are the numbers that track game performance or development progress, and game analytics uses these metrics to identify trends and changes and to support decision making.

According to the classification in [1], game metrics can generally be divided into three categories, as shown in Fig. 5: player metrics, process metrics, and performance metrics. Player metrics usually focus on player behavior and customer research. Process metrics are used for monitoring and managing the game development process. Performance metrics are closely related to technical monitoring of the game, such as frame rate, number of bugs, and game client execution performance.

Fig. 5 Game metrics classification

3 Research method

A literature review is a type of review research paper that includes the classification, substantive findings, and theoretical contributions on a particular topic [11]. The main driver for conducting this review was to categorize and summarize the work done on game analytics, identify potential gaps and commonalities in current research, and form a baseline for future research. The search strategy covered the following databases: IEEE and ACM Digital Library, Scopus, and Google Scholar. We chose these databases because they are expected to cover most of the research on game analytics. As shown in Fig. 6, Google search interest in game analytics keywords increased between 2005 and 2019. The overall trend shows that interest in game analytics has been increasing from 2009 onwards.

Fig. 6 Online search interest trends on game analytics by Google

3.1 Research questions

The main goal of this literature review is to explore the applications of BI in the game industry, with a main focus on game analytics. For this purpose, we state the following main research questions:

  • RQ1: What aspects of game analytics have been explored so far in the available literature?

  • RQ2: What are the main purposes of using analytics in the game industry?

  • RQ3: What problems or challenges in the game area can be addressed by using game analytics?

  • RQ4: What kinds of algorithms have been used in game analytics for prediction?

  • RQ5: What research areas have already been covered in literature, but still need further development?

3.2 Data collection

Game analytics is a broad concept used in the game industry. Based on the search results, we compiled the relevant search criteria, as shown in Table 1. We retrieved papers related to game analytics from the databases. We then focused on the classification and subdivisions of game analytics based on the game industry value chain, as game analytics can be used not only in academia but also in the game industry.

Table 1 Search criteria for relevant research

3.3 Databases searched

In practice, we queried several databases, including some of the primary databases for computer science and general scientific research. Specifically, we mainly searched IEEE and ACM Digital Library, Scopus, and Google Scholar.

3.3.1 Search terms

To perform the searches on the databases, we focused on our three main terms of interest: Business Intelligence, game, and analytics. Because, based on the traditional game value chain, the term “game analytics” can apply throughout the whole process, we conducted several parallel searches. All searches were restricted to title, abstract, and keywords.

3.3.2 Study selection

After removing duplicates, we followed a four-step screening process. In Step 1, we scanned the titles and abstracts of all papers and compared them with the inclusion and exclusion criteria defined below. In Step 2, after the first scan, each study was classified as possible or excluded. In Step 3, we considered potential research issues and reviewed the relevant papers to ensure that they provided sufficient information for our literature review. In Step 4, we made a detailed selection and kept only the papers highly relevant to game analytics.

  • Inclusion criteria

    Journals, conference papers, or books that include empirical evidence relating to game analytics.

  • Exclusion criteria

    Publications whose full text is not available to download, publications that focus only on serious games, and publications not written in English.

We exclude research on serious games because a literature review of this area already exists [8]. As shown in Fig. 7, which depicts the literature selection process, our database search returned 1446 papers related to game analytics. After removing duplicate papers and papers focusing on serious games, 264 articles remained. Further screening of titles, abstracts, and content found 71 of these papers to be highly relevant to game analytics. These 71 papers were then analyzed in detail.

Fig. 7 Related literature review selection process

4 Review process

4.1 Identification of dimensions

To analyze prior literature, a comprehensive, hierarchical coding system was established. According to the traditional game value chain shown in Fig. 3, the game industry starts with the game developers, who are responsible for developing the games. As the traditional game value chain covers the whole game industry, we used it as the starting point to determine the basic dimensions of our coding system. Following the traditional game value chain, it is reasonable for game analytics to include at least four parts: game player analytics, game development analytics, game publishing analytics, and game distribution analytics. As the literature review progressed, we found many studies on data visualization and game prediction in the collected literature. Therefore, we added two further parts, data visualization and prediction, to the classification based on the traditional game value chain.

4.2 Identification of codes

In the literature review process, we used MAXQDA, a qualitative data analysis software tool, to code the papers. Because codes can be assigned to text segments in the selected papers, the research process is rigorous. MAXQDA also provides a convenient way to automatically extract coded text passages, so that all the research data can be synthesized. In summary, the tool supports an in-depth analysis of all selected papers related to game analytics and provides effective assistance for the literature review.

As shown in Fig. 8, based on the traditional game value chain, we introduce a coding system to code all the collected papers and obtain the final result. The coding process uses two levels of codes. The first-level codes are game development analytics, game player analytics, game publishing analytics, distribution channel analytics, game prediction analytics, and data visualization. For the second level, based on the preliminary classification of game analytics [1], we introduce sub-codes for further coding: gameplay, performance, process, interface, system analytics, in-game behaviors, player segmentation, acquisition, retention, revenue analytics, churn prediction, and revenue prediction.

Fig. 8 Research result for game analytics
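The two-level coding system described above can be pictured as a small lookup structure. The following is only an illustrative sketch: the grouping of sub-codes under first-level codes is an assumption inferred from the section structure of this review, not taken from Fig. 8, and the names `CODING_SYSTEM`, `first_level_of`, and `tally` are hypothetical. It does not reproduce how MAXQDA stores codes.

```python
from collections import Counter

# First-level codes mapped to their sub-codes.
# ASSUMPTION: this grouping is inferred from the section headings of the
# review; distribution channel analytics and data visualization are given
# no sub-codes here because the text lists none for them.
CODING_SYSTEM = {
    "game development analytics": [
        "gameplay", "interface", "system analytics", "process", "performance",
    ],
    "game player analytics": ["player segmentation", "in-game behaviors"],
    "game publishing analytics": ["acquisition", "retention", "revenue analytics"],
    "distribution channel analytics": [],
    "game prediction analytics": ["churn prediction", "revenue prediction"],
    "data visualization": [],
}


def first_level_of(code):
    """Return the first-level code that a code belongs to, or None."""
    if code in CODING_SYSTEM:
        return code
    for first_level, sub_codes in CODING_SYSTEM.items():
        if code in sub_codes:
            return first_level
    return None


def tally(coded_segments):
    """Count coded text segments per first-level code.

    coded_segments is an iterable of (paper_id, code) pairs, where code is
    either a first-level code or one of its sub-codes.
    """
    counts = Counter({first_level: 0 for first_level in CODING_SYSTEM})
    for _paper_id, code in coded_segments:
        first_level = first_level_of(code)
        if first_level is None:
            raise ValueError(f"unknown code: {code}")
        counts[first_level] += 1
    return dict(counts)
```

Counting at the first level while coding at the second level keeps the summary per category independent of how finely each category is subdivided.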

5 Research results

Based on our literature review and the coding system, we divide game analytics research into six parts: game player analytics, game development analytics, game publishing analytics, game distribution analytics, game prediction, and data visualization. In the following, we use these categories to structure the presentation of our literature review results.

5.1 Game player analytics

Game player analytics focuses on the players themselves. Traditionally, player research uses qualitative methods and surveys player experience, satisfaction, and engagement. Most of these studies are therefore conducted through interviews, in-depth questionnaires, and observations. In practice, however, most game player researchers use both qualitative and quantitative approaches. For example, Canossa et al. [12] use qualitative and quantitative research methods for game player analytics, aiming to identify patterns of player behavior and point out potential frustration before players leave a game. In addition, collecting game telemetry data, for example during usability and playability testing, provides further insight into how players play these games and how they behave during the game experience, such as player in-game progression and distribution [13]. Furthermore, player analytics based on the highest playtime metrics can be used to guide game design [14].

5.1.1 Player segmentation

Game designers need to focus not only on gameplay development but also on who the potential players are and what they require. As discussed by Hamari and Lehdonvirta [15], development needs to meet the requirements of different game players based on player segmentation. A recent trend is that the requirements of different player segments are increasingly considered in the early game design stage, which makes game marketing and promotion more effective [16]. Segmentation can be used to describe differentiation so as to meet human requirements as accurately as possible [17]. In practice, segmentation is an effective way to identify different player groups and to make sure that a game is designed with the requirements of specific players in mind [18]. The goal of segmentation is to further classify player groups and provide games that are more in line with player requirements. Players’ needs are diverse, and so are their motivations for playing games. These studies break down player behavior and classify it. Player segmentation can be used to target the motivations of different players during the game design process. In addition, game providers should develop different marketing strategies for different segments of gamers [19]. Kallio et al. [20] discuss immersion as an important indicator for guiding and evaluating player behavior and motivation in games. To conduct more effective segmentation, player research needs to be taken into account. Stanton et al. [21] present the first step towards a method for self-refining games, in which game systems can be continually improved by player analytics. They find that game objectives cause players to explore only a small fraction of the entire state space. Based on that result, they build a data-driven simulation that helps players explore more of the space and that can also be used for complex dynamical game systems.

5.1.2 Player behaviors

Player behaviors include in-game actions such as navigation and interaction with game objects and other in-game entities. Player behavior research covers specific in-game behaviors throughout the game experience. Darken and Anderegg [22] introduce the concept of regarding player behavior as simulacra. Based on the simulacra, they provide candidate movement models that fit different types of players. Thawonmas et al. [23] suggest detecting game bots based on their in-game behavior, especially behavior related to the purposes for which the bots were designed. This approach has the potential to distinguish human player behavior from automated program behavior. Nacke et al. [24] present a quantitative study of player behaviors in a social game called Health Seeker. They conclude that a well-connected in-game social network and in-game interaction can improve player performance in solving game missions. In addition, Bauckhage et al. [25] show how cluster analysis can be used for game behavioral data. They target game data scientists and present a tutorial on applying clustering techniques to player behavioral data. They emphasize the potential of cluster analysis for game design and development. However, they also point out that the application of clustering techniques to player behavioral analysis is still in its infancy.
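To make the idea of clustering player behavioral data concrete, the sketch below runs plain k-means (Lloyd's algorithm) on a handful of made-up player records. It is not the method of any cited paper: the two features (hours played per week, purchases per week), the data, and the function name `kmeans` are assumptions chosen for illustration, and the initial centroids are passed in explicitly so that the run is deterministic.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain k-means (Lloyd's algorithm) on tuples of floats.

    points: list of equal-length tuples, one per player.
    centroids: initial centroids, supplied by the caller.
    Returns (labels, centroids) after convergence or `iterations` rounds.
    """
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: attach each player to the nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                mean = tuple(sum(dim) / len(members) for dim in zip(*members))
                new_centroids.append(mean)
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[k])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical data: (hours played per week, purchases per week).
players = [(2.0, 0.0), (3.0, 1.0), (2.5, 0.5), (20.0, 5.0), (22.0, 6.0), (21.0, 4.0)]
labels, centers = kmeans(players, centroids=[players[0], players[3]])
```

On this toy data the first three players end up in one cluster and the last three in the other. With real telemetry the features would normally be scaled first, since k-means is sensitive to differences in units.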

A player can easily generate thousands of behavioral measures during a single play session, because every input to the game system triggers reactions and responses. Accurate measurement of player activity therefore involves many actions that need to be processed immediately. For example, in well-known games such as World of Warcraft (WOW), measuring player behavior can involve data such as the position of the player’s character, health, mana, stamina, character name, level, equipment, and currency. Usually, this information can be collected from the game client and the game servers. El-Nasr et al. [1] point out that analyzing behavioral data from games can be challenging, especially for massively multiplayer games. Each of these games has thousands of simultaneously active players spread across hundreds of instances of the same virtual environment. Drachen and Canossa [26] point out that player behavior analysis is based on instrumentation data, that is, automated, detailed, quantitative information about player behavior within the virtual environment of digital games. Hadiji et al. [27] show that the ability to model, understand, and predict future player behavior is of crucial value, allowing developers to obtain data-driven insights to inform design, development, and marketing strategies. Drachen et al. [28] improve player modeling using self-organization and provide an initial study on identifying different player behaviors in a commercial game. Wallner et al. [29] use lag sequential analysis (LSA), which applies statistical methods to aid the analysis of players’ behavioral streams. Their method provides an effective way to analyze player behavior, especially when faced with large behavioral streams.
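Lag sequential analysis starts from counts of how often one behavior follows another at a given lag in a behavioral stream. The sketch below covers only that counting step and the resulting conditional probabilities; the significance testing that a full analysis adds on top is omitted. The event names and the function names `lag_transitions` and `transition_probabilities` are hypothetical and not taken from Wallner et al. [29].

```python
from collections import Counter, defaultdict


def lag_transitions(events, lag=1):
    """Count (given, target) pairs where target occurs `lag` steps after given."""
    return Counter(zip(events, events[lag:]))


def transition_probabilities(counts):
    """Turn transition counts into P(target | given)."""
    totals = defaultdict(int)
    for (given, _target), n in counts.items():
        totals[given] += n
    return {
        (given, target): n / totals[given]
        for (given, target), n in counts.items()
    }


# Hypothetical behavioral stream of a single player.
stream = ["move", "attack", "loot", "move", "attack", "attack", "loot", "move"]
counts = lag_transitions(stream, lag=1)
probs = transition_probabilities(counts)
```

In this toy stream, "attack" is followed by "loot" in two of its three occurrences, so `probs[("attack", "loot")]` is 2/3. For large streams the same counts can be accumulated incrementally per player rather than from a list held in memory.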

In brief, player analytics is the foundation of game analytics. It not only can guide the game design process based on the player requirements but also can discover potential problems in game development. Player research is also vital for game publishing, which can give clear guidance about game optimizations. It also can be used to improve game retention, deliver more game revenue, and extend the game life cycle.

5.2 Game development analytics

Game analytics has many applications in game development, mainly in monitoring the development process. It covers technical performance indicators of game development, such as bug and crash monitoring. Hullett et al. [30] explore how data can be used to drive game design decisions in game development. They define a mixture of qualitative and quantitative data sources and present a case study showing how data collected from a launched game can guide game development. Game development analytics originally focused on the analytics of core gameplay, interaction, and in-game systems [1]. However, research on game development analytics needs to cover the whole game development process. A new classification should therefore be provided that includes not only core gameplay analytics and interface analytics but also game system analytics, process analytics, and performance analytics, as shown in Fig. 9.

Fig. 9 Game development analytics classification

5.2.1 Gameplay analytics

Gameplay is the core of a game and represents how the game is played. It relates to the user's actual behavior as a player, such as in-game interaction, item trading, and navigation in the game environment. Gameplay analytics is important for evaluating game design and player experience. Usually, gameplay data is used to collect feedback on potentially unclear elements in the game, issues with the game controls, and more general feedback on the enjoyability of the game [31]. Analysis of gameplay data is crucial for evaluating design decisions and refining a game experience [32]. Medler et al. [33] present how a visual game analytic tool can be developed to analyze the player gameplay process. They develop analytic tools that can monitor millions of players after the game is launched. Mirza-Babaei et al. [34] provide new user research methods that capture player interactions and behaviors across the gameplay experience and reveal potential problems in the gameplay design. Emmerich and Masuch [35] discuss the gameplay metrics used to measure player behaviors. They also present the conceptualization, application, and evaluation of three social gameplay metrics that measure social presence, player cooperation, and leadership, respectively. Drachen and Canossa [36] point out that gameplay metrics are instrumentation data that can measure player behavior and game interaction. Their research focuses on utilizing game metrics for gameplay analytics and guiding the development of commercial game products.

5.2.2 Interface analytics

Interface analytics covers all interactions that the player performs with the game interface and menus. These interactions are usually tracked through different game variables, such as mouse sensitivity, finger touch pressure, and monitor brightness. Interface analytics rests on the premise that all menu and button settings can be recorded. Only with the recorded data can the click volume of interface icons and the validity of the design be analyzed effectively. Interface analytics is closely related to how players interact with the game UI and with the in-game interface and system. Xu et al. [37] analyze the bottlenecks in conventional game design practices and develop a single match module. This module can be used to familiarize players with interface interactive analytics.

5.2.3 System analytics

System analytics covers all actions of the game engine and its sub-systems, such as the Artificial Intelligence (AI) system, in-game events, and Non-Player Character (NPC) actions. It can be used to measure the effectiveness of the system design and to guide game developers in designing the game system effectively. At present, system analytics focuses on research into in-game systems and gives guidance for game development. Weber and Mateas [38] focus on in-game system analytics in the game StarCraft. Their research forms a component of an AI system that makes StarCraft better based on data analysis. System analytics can thus also help to improve the in-game system based on data analysis.

5.2.4 Process analytics

Game process analytics focuses on the game development process. It monitors game development and provides guidance on the detailed development process, such as using agile development methods to manage it. Process analytics can help developers improve development efficiency, find potential development problems, and fix them promptly. Hullett et al. [30] focus on the collection and analysis of game data, which can be used to reveal potential problems in the game development process and guide improvement. They summarize what went right and wrong in the development process. With process analytics, developers can better anticipate and avoid problems in their game development. However, as more games are developed, the number of game genres continues to increase, so related metrics are also needed to measure the development process for different kinds of games.

5.2.5 Performance analytics

Performance analytics relates to the performance of the technical and software infrastructure behind a game. It includes the frame rate, the stability of client execution, bandwidth, game build quality, and the number of bugs found by QA testing. For example, Wang et al. [39] measure and analyze the performance of WOW as a representative online game and focus on packet statistics at different levels, such as game delay and bandwidth consumption.

In brief, from the game industry’s perspective, effective data analytics during game development can help developers optimize games, verify the core gameplay, polish the interaction design, and improve the player experience. It can help developers make the right decisions, improve development efficiency, and reduce cost. In practice, however, how to obtain the necessary game development data and how to use data analytics to avoid mistakes in the direction of game development still need further research.

5.3 Game publishing analytics

Early game analytics focused on game development and game ontology research [1]. The application of game analytics in game publishing, however, is limited and lacks systematic study. Effective game analytics can contribute to a successful game release and marketing promotion, help optimize the game in a targeted manner, extend its life cycle, and increase revenue. Moreira et al. [40] use the ARM (acquisition, retention, and monetization) funnel model as the basic analysis for game publishing. According to the ARM funnel model, game publishing analytics can be divided into three parts. As shown in Fig. 10, data analytics in the game publishing process includes game acquisition analytics, retention analytics, and game revenue analytics.

Fig. 10 Game publishing analytics classification

5.3.1 Acquisition analytics

Acquisition analytics focuses on how to reduce the cost of attracting new users. It also tracks how many new players enter the game, how many players finish the tutorial, and how much money is spent on user acquisition. To acquire more players, game developers usually first invest in development and then license their games for publishing on target platforms. The publisher often needs to acquire users by buying ads or through viral distribution on social networks. Dheandhanoo et al. [41] describe an analysis-based approach to measuring user acquisition effectiveness in marketing promotion. Although their analysis shows that marketing campaigns improved based on data analysis, there are still many potential ways to enhance user acquisition results and adjust marketing campaigns to suit different players.
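The three quantities named above (new players, tutorial completions, and acquisition spend) combine into a few simple ratios. The sketch below is one possible way to compute them; the function name `acquisition_summary`, the metric names, and the numbers are assumptions for illustration and are not drawn from Dheandhanoo et al. [41].

```python
def acquisition_summary(ad_spend, new_players, tutorial_completions):
    """Summarize a user acquisition campaign from three raw counts.

    ad_spend: money spent on acquisition in the period.
    new_players: players who entered the game in the period.
    tutorial_completions: how many of those players finished the tutorial.
    """
    if new_players <= 0:
        raise ValueError("new_players must be positive")
    if not 0 <= tutorial_completions <= new_players:
        raise ValueError("tutorial_completions must be between 0 and new_players")
    return {
        # Average spend needed to bring one new player into the game.
        "cost_per_new_player": ad_spend / new_players,
        # Share of new players who finished the tutorial.
        "tutorial_completion_rate": tutorial_completions / new_players,
        # Spend per player who finished the tutorial, if any did.
        "cost_per_tutorial_completion": (
            ad_spend / tutorial_completions if tutorial_completions else None
        ),
    }


# Hypothetical campaign: 1000 currency units spent, 500 new players, 400 completions.
summary = acquisition_summary(ad_spend=1000.0, new_players=500, tutorial_completions=400)
```

For this hypothetical campaign the cost per new player is 2.0 and the tutorial completion rate is 0.8. Comparing these ratios across campaigns or channels is one way to judge which acquisition spend is effective.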

5.3.2 Retention analytics

Retention rate is a vital indicator of the stickiness of a game. It not only measures how engaged players are in the game but can also be used to evaluate game quality. The concept of retention rate comes from marketing research, where Hennig-Thurau and Klee [42] develop a conceptual foundation for investigating the customer retention process using the concepts of customer satisfaction and relationship quality. Retention rate was first a factor in analyzing users’ awareness of a brand. The concept was then applied to games, especially to the analysis of player retention. Debeauvais et al. [43] analyze the mechanisms of player retention in massively multiplayer games and focus on how to improve game retention. They introduce three key metrics to measure retention: weekly playtime, stop rate, and how long respondents have been playing. Their analysis shows how a game can efficiently wield a powerful retention system. However, these metrics are only a first step towards measuring player retention in hardcore games. Other kinds of games, such as casual games, need different metrics to measure retention.
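The text does not fix a formula for the retention rate. One common convention, assumed here and not taken from the cited papers, is day-N retention: the share of players in an install cohort who are active exactly N days after installing. The function name `day_n_retention` and the sample data are hypothetical.

```python
from datetime import date, timedelta


def day_n_retention(install_dates, activity, n):
    """Share of the cohort that was active exactly n days after install.

    install_dates: dict mapping player id to install date.
    activity: dict mapping player id to a set of dates with at least one session.
    """
    if not install_dates:
        return 0.0
    retained = sum(
        1
        for player, installed in install_dates.items()
        if installed + timedelta(days=n) in activity.get(player, set())
    )
    return retained / len(install_dates)


# Hypothetical cohort of four players who all installed on the same day.
day0 = date(2024, 1, 1)
installs = {"a": day0, "b": day0, "c": day0, "d": day0}
sessions = {
    "a": {date(2024, 1, 2), date(2024, 1, 8)},
    "b": {date(2024, 1, 2)},
    "c": set(),
    "d": {date(2024, 1, 3)},
}
d1 = day_n_retention(installs, sessions, 1)
d7 = day_n_retention(installs, sessions, 7)
```

In this toy cohort, two of four players return on day 1 and one of four on day 7. Other definitions, such as counting any activity on or after day N, give different numbers, so the definition should be stated alongside any reported rate.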

Demediuk et al. [44] study player retention in League of Legends (LOL) using survival analysis. Their study aims to understand how specific behavioral characteristics influence the likelihood that players keep playing the game. Survival analysis is a practical approach because it provides the ratio and assesses the features of players who are at risk of leaving the game. The final results show that the duration between matches is a reliable indicator of retention. However, the paper does not discuss the effects of other factors on retention. Retention is a complex problem, and many different reasons may lead players to leave a game, so retention also differs between games.
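As a rough illustration of what survival analysis computes, the sketch below implements the Kaplan-Meier estimator, a standard way to estimate the probability that a player is still active after a given time when some players have not yet churned (censored observations). The text does not say which survival model Demediuk et al. [44] use, so this is not a reconstruction of their method; the function name `kaplan_meier` and the data are assumptions.

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier survival curve.

    durations: time each player was followed (until churn or end of observation).
    observed: True if churn was observed at that time, False if censored.
    Returns a list of (time, survival probability) at each churn time.
    """
    if len(durations) != len(observed):
        raise ValueError("durations and observed must have the same length")
    churn_times = sorted({t for t, e in zip(durations, observed) if e})
    survival = 1.0
    curve = []
    for t in churn_times:
        # Players still under observation just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Players whose churn was observed exactly at time t.
        churned = sum(1 for d, e in zip(durations, observed) if d == t and e)
        survival *= 1 - churned / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical data: days until churn; False marks players still active
# when observation ended.
days = [1, 2, 2, 3, 5, 5]
churned = [True, True, False, True, False, True]
curve = kaplan_meier(days, churned)
```

For this toy data the estimated survival probability drops to 5/6 after day 1, 2/3 after day 2, 4/9 after day 3, and 2/9 after day 5. Censored players still count as at risk up to the time they leave observation, which is what distinguishes this estimate from a naive churn fraction.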

Park et al. [45] focus on the critical factors of player retention at different levels in online multiplayer games. Based on log data from 51,104 individual players, they explore the key factors that influence player retention throughout the game. They find that the key indicators of retention vary with game level. Achievement-related features are important for players on the way to becoming senior players. However, once players reach the highest game levels, social networking features become vital for retention. This finding indicates that social networks positively affect retention when individuals interact with partners of an appropriate level. Yee [46] summarizes three motivational components that are strongly related to retention: achievement, social, and immersion. Andersen et al. [47] find that music and sound effects have little effect on player retention, but animations lead users to play more. They also report that minor gameplay modifications affect player retention more than aesthetic variations; how to generalize and apply these results to other game genres requires further research.

Social factors can be an effective way to improve the retention rate. Krause et al. [48] investigate the potential of gamification with social game elements for increasing retention. Players in the experiment show a significant increase of 25% in retention and 23% higher average scores when the interface is gamified. Kayes et al. [49] analyze which factors lead a blogger to continue participating in the community and releasing new content. They conclude that users who face fewer constraints and have more opportunities in the community are more likely to be retained. In addition, game events also influence retention: how users respond to events differs by character level, item purchasing frequency, and game-playing time band, and these differences affect retention [50]. Lucas et al. [51] analyze how successful mobile games use persuasive mechanics for user retention. They link each persuasive mechanic to a base mechanic and a corresponding psychological theory. The results can be seen as an addition to the toolkit of mobile game developers.

In short, the retention rate is a key indicator, especially for game publishing, and an effective way to measure game quality. From the industry side, it is also a critical benchmark of a game’s success in the market. However, how to define a unified set of metrics for measuring retention across different types of games still needs further study. Besides this, the volume of acquired data grows geometrically as more player information is collected. How to effectively reduce the complexity of data analytics is therefore also an important research direction.
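
As an illustration of one widely used retention metric, the sketch below computes day-N retention, here assumed to mean the share of newly installed players who are active exactly N days after their install date. This definition is a common industry convention rather than one taken from the reviewed papers, and the function name `day_n_retention` and the sample data are hypothetical.

```python
from datetime import date


def day_n_retention(installs, sessions, n):
    """Share of installed players who were active exactly n days after install.

    installs: {player_id: install_date}
    sessions: iterable of (player_id, session_date)
    """
    if not installs:
        return 0.0
    returned = {
        pid
        for pid, day in sessions
        if pid in installs and (day - installs[pid]).days == n
    }
    return len(returned) / len(installs)


# Hypothetical install and session logs for four players.
installs = {
    "a": date(2024, 1, 1),
    "b": date(2024, 1, 1),
    "c": date(2024, 1, 2),
    "d": date(2024, 1, 2),
}
sessions = [
    ("a", date(2024, 1, 2)), ("a", date(2024, 1, 8)),
    ("b", date(2024, 1, 2)),
    ("c", date(2024, 1, 3)), ("c", date(2024, 1, 9)),
    ("d", date(2024, 1, 5)),
]
print("D1 retention:", day_n_retention(installs, sessions, 1))
print("D7 retention:", day_n_retention(installs, sessions, 7))
```

Other definitions (for example, counting any activity on or after day N) give different numbers for the same logs, which is one reason a unified set of retention metrics is hard to establish.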

5.3.3 Revenue analytics

With its rapid development, the mobile game segment now takes up the largest share of the game industry. Most mobile games are free, so players can download them at any time. Hence, for freemium games, revenue comes mainly from In-App Purchases (IAP) of game items or from advertising. Drachen et al. [52], through a case study of more than 200,000 players, analyze the relationship between social features and revenue in freemium casual mobile games. They use classifier and regression models to evaluate the impact of social interaction in casual games on player life-cycle value. The final results show that social activity is not associated with the trend towards advanced players, but it does improve game revenue.

In freemium games, there is a big difference between paying players, who pay for IAP, and non-paying players. Non-paying players make up the majority of freemium players, which leads to highly uneven purchases in mobile games [1]. The key challenge for mobile game developers is to reduce the churn rate and grow the player base, not only by improving the retention rate but also by considering how junior players develop into senior players. A related goal is to increase the player’s life-cycle value (LTV), because user acquisition costs for mobile applications have increased significantly in recent years. As user acquisition costs and market promotion fees continue to increase, research on improving game revenue is essential for game developers and publishers.

Alomari et al. [53] extract 31 features using a decision tree and identify the ten most important features for game success: inviting friends, skill tree, leaderboard, Facebook, time skips, request friend help, event offers, customizable, soft currency, and unlock new content. The results help game developers increase their revenue. The study also concludes that a factor highly related to revenue is the number of daily active users. Besides this, other factors also play a significant role in game success, such as culture, lifestyle, loyalty to the game brand, and promotion. Hsu et al. [54] present a novel and intuitive market concept called indicator products, which is used to analyze in-game purchases. Such research helps game designers and game researchers observe player behavior and improve game revenue.

5.4 Distribution channel analytics

In a broad sense, channels are the specific paths that connect games with players. Most distribution channels, such as the App Store and Google Play, aggregate a large number of users and form a platform for games. Krafft et al. [55] provide four insights into marketing channel research, covering marketing channel relationships, channel structures, popular topics, and market strategies. They also discuss four potential trends that help in understanding changes on the channel side: service economies, globalization, reliance on technologies, and big data for channel decisions. Similarly, Cramer et al. [56] emphasize the contributions of researchers from both industry and academia who have experience with deployment and distribution. This research provides an overview of the challenges that distribution channels face, such as markets, new devices, and services, and of methods to address them. Latif et al. [57] calculate the IAP purchase rate of free and paid applications on the Google Play channel. This research benefits game developers and gives guidance for their development phases.

Channel distribution is vital for game success, especially for mobile games. Potential players are commonly reached through channels such as the App Store and Google Play, which play an essential role in the distribution of mobile games. Channel analytics combines channel data with game data to provide decision support for game development and publishing. Different channels can be compared by player active rate, new player acquisition, retention, and payment rate to determine the best channel for a game. Channel analytics can not only identify the main target users but also help build their loyalty to the game.
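
A channel comparison of this kind can be sketched as a simple aggregation over per-player records. The example below is only an illustration under assumed inputs: the record fields (`channel`, `retained_d7`, `paid`), the channel names, and the function `channel_metrics` are all hypothetical.

```python
from collections import defaultdict


def channel_metrics(players):
    """Aggregate per-player records into per-channel comparison metrics.

    players: iterable of dicts with keys
             "channel" (str), "retained_d7" (bool), "paid" (bool)
    """
    totals = defaultdict(lambda: {"players": 0, "retained": 0, "payers": 0})
    for p in players:
        t = totals[p["channel"]]
        t["players"] += 1
        t["retained"] += p["retained_d7"]  # True counts as 1
        t["payers"] += p["paid"]
    return {
        channel: {
            "new_players": t["players"],
            "d7_retention": t["retained"] / t["players"],
            "payment_rate": t["payers"] / t["players"],
        }
        for channel, t in totals.items()
    }


# Hypothetical records for players acquired through two channels.
players = [
    {"channel": "channel_a", "retained_d7": True, "paid": False},
    {"channel": "channel_a", "retained_d7": True, "paid": True},
    {"channel": "channel_a", "retained_d7": False, "paid": False},
    {"channel": "channel_a", "retained_d7": False, "paid": False},
    {"channel": "channel_b", "retained_d7": True, "paid": True},
    {"channel": "channel_b", "retained_d7": False, "paid": False},
]
for channel, metrics in sorted(channel_metrics(players).items()):
    print(channel, metrics)
```

In this toy data both channels retain half of their players, but `channel_b` converts a larger share into payers, so the "best" channel depends on which metric the publisher prioritizes.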

However, at present, game channel research is treated only as part of the statistics and analytics of market effects. Game attributes, such as the size of the game package, may affect player downloads. Different game channels have different player attributes, and channel benchmarks also affect the distribution of games, which makes the situation complicated. How to combine game analytics with channel attributes in order to increase player downloads and reduce the marketing promotion budget is therefore a potential research gap that requires more attention.

5.5 Game prediction analytics

Game prediction analytics uses historical data to predict future events. Typically, historical data is used to build a mathematical model that captures essential trends and forecasts the future. Game analytics can thus be used to predict game performance in advance.
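
The simplest instance of such a model is a straight-line trend fitted to a historical series by ordinary least squares and extrapolated forward. The sketch below illustrates only this idea; it is not one of the algorithms reviewed in this section, and the function names and the sample series are hypothetical.

```python
def fit_linear_trend(values):
    """Least-squares fit of values[t] ~= intercept + slope * t for t = 0, 1, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_t = (n - 1) / 2
    mean_y = sum(values) / n
    cov = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(values))
    var = sum((t - mean_t) ** 2 for t in range(n))
    slope = cov / var
    intercept = mean_y - slope * mean_t
    return intercept, slope


def forecast(values, steps):
    """Extrapolate the fitted trend for the next `steps` periods."""
    intercept, slope = fit_linear_trend(values)
    n = len(values)
    return [intercept + slope * (n + k) for k in range(steps)]


# Hypothetical daily active users over five days.
daily_active = [100, 110, 120, 130, 140]
print(forecast(daily_active, 2))
```

Real game data rarely follows a straight line (launch spikes, weekly cycles, and content updates all break the trend), which is why the work reviewed below relies on richer models.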

Churn prediction models were developed early on in other sectors, such as wireless communication, banking, and insurance. For games, previous work on in-game prediction has mainly focused on Massively Multiplayer Online Role-Playing Games (MMORPG) and Free-to-Play (F2P) mobile games. Tarng et al. [58] provide a prediction model for MMORPG gamers, which takes a player’s game hours as the input and predicts whether the player will leave soon. Hadiji et al. [27] develop a generically applicable churn prediction model that does not rely on game-design-specific features and can be used for churn prediction in F2P games. According to our literature review, several algorithms are currently used for churn prediction, mainly Decision Trees, Random Forest, Support Vector Machines, Neural Networks, and Hidden Markov Models [59].

Kim et al. [60] focus on churn prediction for mobile and online casual games. Although churn prediction and analysis can provide essential insights and action cues on retention, their application to play log data has been primitive or very limited in the casual game area. They therefore develop a standard churn analysis process for casual games. Runge et al. [61] predict the departure of high-value players in two F2P games by comparing different classifiers and feature sets. They also provide a quantitative definition of the high-value player segment, define the churn event, and formulate the prediction as a binary classification problem. Borbora and Srivastava [62] focus on user behavior modeling and adopt a player lifecycle-based approach to predict churn in online games. They analyze the activity characteristics of churning players and compare them to those of regular players. The results show that their method predicts churn better and can provide essential insights into churn in MMORPGs.
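
Framing churn prediction as binary classification requires turning raw play logs into labeled examples. The sketch below shows one possible way to do this, under assumptions that are ours and not taken from [60], [61], or [62]: features are computed from an observation window before a cutoff date, and a player is labeled as churned if no session occurs within a fixed window after the cutoff. The window lengths, feature choices, and function name are hypothetical.

```python
from datetime import date, timedelta


def build_churn_dataset(sessions, cutoff, observation_days=7, churn_days=14):
    """Return {player_id: (features, label)} for a binary churn classifier.

    sessions: iterable of (player_id, session_date)
    features: (sessions in the observation window,
               days between the last observed session and the cutoff)
    label:    1 if the player has no session in the churn_days starting at
              the cutoff, otherwise 0
    Only players active in the observation window are included.
    """
    obs_start = cutoff - timedelta(days=observation_days)
    churn_end = cutoff + timedelta(days=churn_days)
    observed = {}
    returned = set()
    for pid, day in sessions:
        if obs_start <= day < cutoff:
            observed.setdefault(pid, []).append(day)
        elif cutoff <= day < churn_end:
            returned.add(pid)
    dataset = {}
    for pid, days in observed.items():
        features = (len(days), (cutoff - max(days)).days)
        dataset[pid] = (features, 0 if pid in returned else 1)
    return dataset


# Hypothetical session log.
sessions = [
    ("a", date(2024, 1, 2)), ("a", date(2024, 1, 6)), ("a", date(2024, 1, 10)),
    ("b", date(2024, 1, 3)),
    ("c", date(2024, 1, 7)), ("c", date(2024, 1, 25)),
    ("d", date(2024, 1, 12)),
]
for pid, (features, label) in sorted(build_churn_dataset(sessions, date(2024, 1, 8)).items()):
    print(pid, features, "churned" if label else "retained")
```

The resulting feature and label pairs can be passed to any of the classifiers named above. The choice of churn window matters: player `c` returns after 18 days and is therefore labeled as churned under a 14-day definition.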

Perianez et al. [63] describe a churn detection method for social games that provides a comprehensive analysis for predicting player churn accurately. For each player, they predict the probability of churn over time, which allows them to distinguish different levels of loyalty. The results show that their churn prediction improves on the accuracy and robustness of traditional analysis. Besides, the social behavior of players in mobile games also affects retention.

Regarding revenue prediction, Sifa et al. [64] focus on predicting future purchase activities by formulating the task as a combination of a player classification problem and a regression problem, and provide a solution. Predicting paying players is the first step in building a revenue forecast model. The algorithms used include Decision Trees, Random Forest, Support Vector Machines, and Poisson Trees. Random Forest and Poisson Trees both provide excellent results across three different observation periods: 1 day, 3 days, and 7 days. There is currently no standard for which observation period is best, but 7 days is widely used in the game industry. Xie et al. [65] focus on predicting first purchase behavior in two social games, using the frequency of game events as the data representation.

Previous work on game prediction mainly focuses on churn prediction and revenue prediction; Fig. 11 shows game prediction with different algorithms. One goal of game prediction research is to investigate the relationship between prediction accuracy and observed game actions: observing more in-game activity yields more accurate predictions. However, prediction results need to be analyzed in combination with the game itself to make correct game optimization decisions. Because most prediction processes are dynamic, the overall prediction also needs to be adjusted as the game system changes. In short, a prediction can only approach the actual situation in theory, and its real value is to provide sufficiently correct guidance for game development and optimization.

Fig. 11 Game prediction with different algorithms

5.6 Game data visualization

Game data visualization is an essential part of game analytics. Through the visualization of data, we can intuitively analyze changes in player behavior in the game and easily understand the specific performance of game publishing. Drenikow et al. [66] provide a new tool to help collect and represent game test data, including a visual representation of players’ in-game data. They introduce tools for different visualizations of game prototypes, which game researchers can easily use to find potential issues. Lu et al. [67] introduce a new visual analysis system called BeXplorer. The system enables analysts to interact with the collected data, which also benefits data analysis. Game developers can use it to collect and present game test data in an accessible and effective way. However, it was originally designed only to facilitate the analysis and visualization of various player behaviors in large MMORPGs. Latif et al. [57] visualize the IAP purchase rate of free and paid applications, as well as the percentage of advertisement support in free and paid applications. This visualization benefits game developers in the development phases and helps players select the kind of games they want to play.

Kang and Kim [68] propose a spatio-temporal visual analysis technique that makes game data easier to understand and analyze. The player’s spatio-temporal data can be used to directly show player behavior in the game world, which helps game developers improve the efficiency of the game development process. Drachen and Schubert [69] review current work on spatio-temporal game analytics, define key terms, and outline current technologies and game applications. They summarize the current problems and challenges in this field, propose visualization methods for spatio-temporal analytics, and present four critical areas of spatial and spatio-temporal analytics that benefit game design and development.

6 Problems and challenges

Game analytics has attracted more and more attention. Because it can be understood as the use of BI in the game industry, it can be applied throughout the whole game industry value chain. However, according to the literature review, there are still gaps in game analytics research at this stage. As shown in Fig. 8, based on our coding system, research on game publishing analytics and distribution channel analytics is underrepresented. Related research still needs to be extended, for example on how to use data analytics to drive game publishing effectively and how to set up the right game metrics for measuring the performance of new game version updates and of channels. Based on our literature review, these research areas are full of challenges.

6.1 Game player analytics challenge

Game player analytics is based on the behavior analytics of game users. Player behavior changes constantly during play and generates a large amount of data that needs to be processed, for example in MMORPGs. How to improve the efficiency of player analytics when processing massive data is a potential research area. Moreover, player analytics results depend on sufficient data collection, which brings two problems. On the one hand, how to obtain accurate results with little data in order to save data collection costs is an open issue. On the other hand, how to ensure that the collected data represents players’ real demands is also a challenge. Besides this, the goal of player segmentation analytics is to further classify player groups and provide games that are more in line with player requirements. It is also worth studying how to meet the needs of different players in the same game based on player analytics.

6.2 Game development analytics challenge

At present, most game analytics research for game development focuses on ensuring that the gameplay meets player requirements and that the game development process is controllable. However, how to ensure that the in-game system design can be adjusted according to player feedback after the game has launched is almost ignored. The design of the in-game system needs to change dynamically in response to player feedback rather than remain fixed. Hence, game analytics for driving game development still needs in-depth research. Besides this, a problem developers face during development is that, before the game is launched, it is hard to be sure that players will accept the original game design. How to use data analysis to evaluate whether the game system design meets player requirements and to avoid design mistakes is therefore also worthy of further research.

6.3 Game publishing analytics challenge

According to the literature review, game analytics research related to game publishing is not comprehensive. At present, most research focuses on retention and revenue analytics. There is no detailed analysis of the entire game publishing process. For example, continuously releasing new game content is essential for game publishing, but how to evaluate the performance of a new version update remains unknown. How to do marketing based on data analytics, and how to maintain active players and deliver more revenue for a launched game, still need further research. The lack of this kind of publishing analytics makes it hard to optimize a game effectively after launch and may lead to publishing failures. So, this part of the research is vital and valuable to pursue. Besides this, with the emergence of the App Store, Google Play, and similar third-party distribution channels, indie game developers who have fewer resources for game development can submit their games to these channels and publish them themselves. Hence, how to help indie game developers guide their publishing through game analytics also needs further research.

6.4 Distribution channel analytics challenge

According to our literature review, channel analytics is currently ignored by most game analytics researchers. However, distribution channels for games differ from other marketing channels because game channels have their own attributes. For example, players from the iOS App Store channel are quite different from players from the Google Play channel, so the same game can perform differently in the two channels. The attributes of a game also have a significant influence on its distribution through channels. In addition, channel attributes and benchmarks are essential factors that restrict game distribution from the industry side. Channel analytics is based on big data from different channels, so how to obtain these data is a potential challenge for research. Besides this, game metrics are what game analytics uses to measure changes and evaluate problems. However, corresponding metrics and analyses are lacking, especially on the channel side, which also hampers channel analytics. In order to highlight the critical role of channels in game distribution and promotion, more research needs to focus on channel distribution, especially on how to use data analytics to obtain target players from channels and reduce the cost of game promotion.

6.5 Game prediction analytics challenge

At present, game prediction research mainly focuses on predicting player churn and game revenue. However, revenue prediction has primarily focused on predicting player purchase behavior, and a useful analysis of revenue prediction based on historical revenue data is lacking. In practice, game developers need to forecast revenue for their games during the publishing process. Based on the revenue forecast, they can plan marketing promotion, such as how much marketing budget to spend on each channel for new user acquisition based on the Return on Investment (ROI). It is also possible to follow up on game development costs and set up benchmarks to evaluate publishing performance. However, making a revenue forecast is hard, especially for indie game developers, who have fewer resources for game development and little or no revenue forecasting experience. The prediction of game revenue, especially how to estimate future revenue from historical revenue data, deserves in-depth research.
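
To illustrate how a revenue forecast could feed into budget planning, the sketch below ranks acquisition channels by ROI, using the standard definition ROI = (revenue - spend) / spend. The channel names, the forecast figures, and the function names are hypothetical, and the forecast revenue per channel is assumed to come from some separate forecasting model.

```python
def channel_roi(forecast_revenue, spend):
    """Return on investment: (revenue - spend) / spend."""
    if spend <= 0:
        raise ValueError("spend must be positive")
    return (forecast_revenue - spend) / spend


def rank_channels(channels):
    """Rank channels by ROI, highest first.

    channels: {name: (forecast_revenue, spend)}
    returns:  list of (name, roi) tuples
    """
    rois = {name: channel_roi(rev, spend) for name, (rev, spend) in channels.items()}
    return sorted(rois.items(), key=lambda item: item[1], reverse=True)


# Hypothetical forecast revenue and planned spend per channel.
channels = {
    "channel_a": (1500.0, 1000.0),
    "channel_b": (900.0, 1000.0),
    "channel_c": (600.0, 300.0),
}
for name, roi in rank_channels(channels):
    print(f"{name}: ROI = {roi:+.0%}")
```

A negative ROI, as for `channel_b` here, suggests that the planned spend would not be recovered. The ranking is only as reliable as the revenue forecast behind it, which is the open problem this subsection points to.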

6.6 Game data visualization challenge

At present, most game data visualization studies focus on providing tools to collect and represent game data, such as a visual representation of player data, displaying all the data through visual information, and enabling analysts to interact with the collected data. However, few studies focus on the visual data provided by the game itself, that is, in-game data visualization. The visualization of in-game data can help players become familiar with the game and enhance the game experience. In the future, how to use game analytics to provide players with a visualized gaming experience inside the game is also worthy of in-depth research.

7 Discussion

Based on our comprehensive literature review, we discuss the answers to the five research questions raised earlier.

First, as for RQ1, which concerns the aspects of game analytics that have been explored in the available literature, we give a preliminary classification and an overview of the different research areas based on the traditional game value chain [10]. As shown in Fig. 8, the classification includes game player analytics, game development analytics, game publishing analytics, game distribution analytics, game prediction, and data visualization, the last two of which are orthogonal to the traditional game value chain. We then use these categories to structure the presentation of the review results.

Second, as for RQ2, which concerns the main purposes of using analytics in the game industry, our literature review shows the following. Koskenvoima and Mäntymäki [70] survey small and medium-sized freemium game developers about the reasons why they use game analytics, analyzing data collected through in-depth interviews. Their results show three main reasons for using game analytics. First, game analytics can assist development and game design. Second, based on data analysis, game developers can effectively reduce the risk of game development and publishing. Third, game analytics helps developers negotiate with investors and publishers. Flunger et al. [71] focus on analytical and predictive models for F2P games. They discuss the use of game analytics in F2P games, especially for helping game developers with player churn prediction and player lifetime value prediction. Besides this, based on the traditional game value chain [10], we find that game analytics can be used in the game industry to improve the efficiency of game development, increase game revenue, and guide game optimization and game publishing.

Third, as for RQ3, which concerns the problems or challenges in the game area that can be addressed by game analytics, our classification and literature review show that game analytics can be used not only to solve the problems of each link of the game industry value chain, but also in the fields of game data visualization and game prediction. Combined with game industry requirements, we discuss the corresponding problems and challenges in Sect. 6. They cover player analytics, game development analytics, game publishing analytics, distribution channel analytics, game prediction, and game data visualization.

Fourth, as for RQ4, which concerns the algorithms that have been used in game analytics for prediction, we summarize the main algorithms that can be used for analyzing game data and making predictions. As discussed in this paper, game analytics can predict trends, help understand the player churn rate, and predict game revenue. Several algorithms are currently used for churn prediction, mainly Decision Trees, Random Forest, Support Vector Machines, Neural Networks, and Hidden Markov Models [59]. Compared with churn prediction, however, revenue forecasting is hard. The prediction of game revenue, especially how to estimate future revenue using time series prediction algorithms, needs in-depth research.

Finally, as for RQ5, which concerns the research areas that have already been covered in the literature but still need further development, this paper presents details about the potential research gaps ignored by many researchers, especially in game publishing analytics and game distribution channel analytics [56]. Besides this, how to improve player analytics for different game content, how to do game development analytics that yields valuable suggestions for game optimization, how to keep the in-game economy balanced through game analytics, and how to do publishing analytics that extends the game life cycle all still need further research.

In addition, according to our literature review, there are further potential research gaps in game analytics. On the one hand, from the game industry side, little game analytics research helps promote the sharing or standardization of game BI knowledge, because the confidentiality of data such as revenue, churn, and retention makes knowledge sharing difficult. That is also a potential reason why game analytics research is currently fragmented [7]. This is to be expected in the explorative phase of a new domain being established, especially for research on game publishing and game distribution channels. On the other hand, after a game has launched, many data statistics are regarded as company secrets, so the game industry has concerns about collaborating with academia. As a result, game analytics as used in the game industry lacks a theoretical basis. That is also the main reason for the potential gaps between the game industry and academic research.

8 Conclusion

In this paper, we provide a comprehensive literature review and a more detailed classification of game analytics. As the traditional game value chain covers the whole game industry, we adopt it as a classification criterion and use it as a starting point to determine the basic dimensions of our coding system. Following this value chain, it is reasonable for game analytics to include at least four parts: game development analytics, game publishing analytics, game distribution channel analytics, and game player analytics. In the course of our literature review, we found that many studies in the collected literature also focus on data visualization and game prediction. Therefore, we added data visualization and prediction as two parts of game analytics research that are orthogonal to the traditional game value chain. Besides this, game as a service means providing game content to players on a continuing revenue model, similar to software as a service. From the game-as-a-service perspective, this also means using BI in the game industry: a service-oriented decision system based on data analysis can guide the whole game industry, especially game publishing analytics, which can help acquire players, retain players, and maximize game revenue effectively.

The main contributions of this paper are fourfold. First, based on the game industry value chain, we provide a comprehensive literature review of game analytics. Second, based on the coding system, we give an overview of the current research status and point out potential research trends in game analytics. Third, we discuss the main purposes of using game analytics in the game industry and the related algorithms used for game prediction. Finally, we present the research gaps and the potential reasons why they exist. This research is valuable as a baseline for future research in this area.