Introduction

The issue of whether games with aggressive or violent content (henceforth called aggressive video games, AVG; Footnote 1) contribute to aggression or violence in society remains significantly controversial worldwide. In the United States, debates culminated in the Supreme Court decision Brown v EMA (2011), wherein the court majority concluded that the evidence could not link aggressive video games to societal harms. This has not ended the debates, however, which tend to become most acute following public acts of violence, particularly those committed by minors (Copenhaver 2015; Markey et al. 2015). One concern is that many previous studies have not been sufficiently rigorous: they have employed unstandardized measures (Elson et al. 2014), failed to control for theoretically relevant third variables (Savage and Yancey 2008), or left room for questionable researcher practices such as calculating predictor or outcome variables differently between publications using the same dataset (Przybylski and Weinstein 2019). The current article seeks to address these issues through reanalysis of a dataset employing preregistration, theoretically relevant controls and a clear, standardized method for assessing both predictor and outcome variables.

Aggressive Video Games Research

Decades of research on aggressive video games have failed to produce either consistent evidence or a consensus among scholars about whether such games increase aggression in young players. Indeed, several surveys of scholars have specifically noted the lack of any consensus (Bushman et al. 2015, Footnote 2; Ferguson and Colwell 2017; Quandt 2017). According to some of these surveys, opinions among scholars also divide along generational lines (older scholars, particularly those who play few or no games, are more suspicious of game effects), discipline (psychologists are more suspicious of game effects than criminologists or communication scholars) and attitudes toward youth (scholars with more negative attitudes toward youth are more suspicious of games).

Regarding violence-related outcomes, the evidence appears to be clearer than for milder aggressive behaviors. As noted in a recent US School Safety Commission report (Federal Commission on School Safety 2018), research to date has not linked aggressive video games to violent crime. Indeed, government reports such as those from Australia (Australian Government, Attorney General’s Department 2010) and Sweden (Swedish Media Council 2011), as well as the Brown v EMA (2011) decision, have been cautious in attributing societally relevant aggression or violence to aggressive video games. Other research has indicated that the release of aggressive video games may be related to reduced violent crime (Beerthuizen et al. 2017; Markey et al. 2015). The most reasonable explanation is that popular aggressive video games keep young males busy and out of trouble, consistent with routine activities theory.

On the issue of aggressive behaviors, both the evidence and opinions are more equivocal. Several meta-analyses have concluded that aggressive video games may contribute to aggressive behaviors (e.g. Anderson et al. 2010; Prescott et al. 2018). However, reanalysis of Anderson et al. (2010) has suggested that publication bias inflated outcomes, particularly for experimental studies (Hilgard et al. 2017). For Prescott et al. (2018), it is less clear that the evidence supports the authors’ conclusions: only very small effect sizes were found (approximately r= 0.08), and most included studies relied on self-report and unstandardized measures and were not preregistered, increasing the potential for spurious findings. By contrast, other meta-analyses (e.g. Ferguson 2015a; Sherry 2007) have concluded that the evidence is insufficient to link aggressive video games to aggressive behaviors. These meta-analyses have also generated disagreements and criticisms (e.g. Rothstein and Bushman 2015), although the Ferguson (2015a, 2015b) meta-analysis was independently replicated (Furuya-Kanamori and Doi 2016). Nonetheless, significant disagreements remain among scholars about which pools of evidence are most convincing. The American Psychological Association (APA) has concluded that aggressive video games are not related to violence but may be related to aggression (American Psychological Association 2015), but the APA statement has itself been critiqued for flawed methods and potential biases (Elson et al. 2019).

Critiques of Aggressive Video Game Research

Disagreements among scholars stem from several concerns. These include systematic methodological issues that may influence effect sizes, as well as the interpretability of those effect sizes and their generalizability to real-world aggression. Critiques of laboratory-based aggression studies have been well elucidated elsewhere (McCarthy and Elson 2018; Zendle et al. 2018). As the current article focuses on longitudinal effects, this review concentrates on that area.

At present, perhaps two dozen longitudinal studies have examined the impact of aggressive video games on long-term aggression in minors (e.g. Breuer et al. 2015; Lobel et al. 2017; von Salisch et al. 2011). Results have been mixed, with effect sizes generally below r= 0.10. However, these studies vary in quality. Some do not adequately control for theoretically relevant third variables (such as gender; boys both play more aggressive video games and are more physically aggressive than girls). Concerns have also been raised about the unstandardized use of both predictor and outcome variables, such that these variables have been constructed differently between articles by the same research group using the same dataset (Przybylski and Weinstein 2019). This raises the possibility of questionable researcher practices that may inflate effect sizes, and consequently the possibility that effect sizes in meta-analyses are inflated in ways that are difficult to detect via traditional publication bias tests. Other issues involve the use of ad hoc measures, which lack standardization or clinical validity, making interpretation of the results difficult.

In addition to the methodological concerns there are also, as noted, disagreements about the interpretability of tiny effect sizes even when they are “statistically significant”. For decades, it has been understood that relying on statistical significance alone can produce interpretation errors (Wilkinson and the Task Force on Statistical Inference 1999). This is particularly true in large-sample studies, wherein increased power can cause noise or “crud factor” (herein defined as spurious correlations caused by common methods variance, demand characteristics, or other survey research limitations) to become statistically significant despite having no relation to real-world effects. Thus, the potential for overinterpretation of tiny effect sizes from large-sample studies is significant, and the Type I error rate of such effects is likely high. As such, some scholars have suggested adopting a minimal threshold for interpretation of r= 0.10 in order to minimize the potential for overinterpretation of spurious findings from large studies (Orben and Przybylski 2019a).

The potential for overinterpretation of crud-factor results is particularly relevant to meta-analysis. For instance, one recent meta-analysis (Prescott et al. 2018) concluded that aggressive video games are linked longitudinally to aggression based on a very weak effect size (r= 0.08). The basis for this conclusion appears to have been that the effect was “statistically significant” despite heterogeneity in findings among the individual studies. However, owing to their greatly enhanced power, almost all meta-analyses are statistically significant, so using this as an index of evidence is dubious. Such tiny effects may not reflect population effect sizes but may simply be the product of systematic methodological limitations and demand characteristics of the included studies.

One approach to examining whether tiny effect sizes are meaningful is to compare them to nonsense relationships. In other words, the effect size for the relationship of interest (in this case aggressive video games and player aggression) is compared to effect sizes for the theoretical predictor (aggressive video games) on theoretically unrelated outcomes (or, vice versa, for the theoretical outcome with nonsense predictors), where relationships are expected to be practically no different from zero. Orben and Przybylski (2019b) did this with screen time and mental health. Examining several datasets, they demonstrated that, in large samples, screen time tended to produce very tiny but statistically significant relationships with mental health. However, these were no different in magnitude than several nonsense relationships, such as those between eating bananas and mental health or between wearing eyeglasses and mental health (both of which were also statistically significant). By making such comparisons, it is possible to come to an understanding of whether an observed statistically significant effect size is meaningful, or likely an artifact that became statistically significant due to the increased power of large samples.

Theoretically Relevant Control Variables

As noted earlier, the gold standard of media effects research is to ensure that theoretically relevant third variables are adequately controlled in multivariate analyses (Przybylski and Mishkin 2016; Savage 2004). Without such controls, bivariate correlations are likely to be spuriously high and to misinform. The most obvious third variable is gender, given higher rates of both aggressive video game play and physical aggression in boys (Olson 2010). Without controlling for gender, any correlation between aggressive video games and aggression may simply be an artifact of being a boy.

The need for proper control variables can be informed by the Catalyst Model (Ferguson and Beaver 2009; Surette 2013), which is a diathesis-stress model of violence. This model posits that violence propensity results from genetic inheritance coupled with early environmental influences, particularly family environment. These lead to the development of a personality style particularly prone to aggressiveness and hostile attributions. Decisions about whether to engage in violence or aggression can be further hampered by difficulties with self-control. From this theoretical perspective, controlling for variables such as family environment, early aggressiveness, and issues related to self-control and impulse control is important.

Thus, control variables have generally been well laid out for aggressive video game studies. These typically include the Time 1 (T1) outcome variable, as well as variables related to family environment (DeCamp 2015), self-control and impulsiveness (Schwartz et al. 2017), and intelligence (Jambroes et al. 2018). Multivariate analyses with proper controls can help elucidate the added predictive value of aggressive video game play above well-known risk factors for increased aggression.

The Singapore Dataset

The current study consists of a reanalysis of a large dataset from Singapore (henceforth simply the “Singapore dataset”) that has been used several times previously (see Przybylski and Weinstein 2019, pp. 2–3, for a full listing and discussion). The validity of previous studies using this dataset has been questioned (Przybylski and Weinstein 2019), not because the dataset is inherently of poor quality, but because variables, and particularly the aggressive video game variable, have been calculated differently across publications by the same scholars. For instance (see Ferguson 2015b), violent game exposure in the Singapore dataset has been calculated by: 1) multiplying self-rated violent content by hours spent playing for three different games and averaging the scores (Gentile et al. 2009), 2) a 4-item measure of violence exposure in games with no reliability reported (Gentile et al. 2011), 3) changing the 4-item measure to a 2-item measure with mean frequency calculated across three games and no involvement of time spent playing (Busching et al. 2013), 4) a 9-item scale comprising gaming frequency and violent and prosocial content for three favorite games (Gentile et al. 2014), and 5) a 6-item scale also comprising gaming frequency, three favorite games and 2-item violent content questions (Prot et al. 2014). In some studies, the authors do not provide enough information to understand how the video game variables were created and whether violent and prosocial video game questions were treated separately or combined (e.g., Gentile et al. 2014). This phenomenon, often described as the “garden of forking paths”, greatly enhances Type I error by allowing researchers undesired degrees of freedom that can be used to shape outcomes to fit hypotheses (Gelman and Loken 2013).

This has raised concern that questionable researcher practices may have produced false positive results in some studies linking aggressive video games to long-term aggression. Relatedly, the dataset includes multiple measures of aggressive and prosocial behavior, but not all were reported in each article. Creating a standardized measurement of aggressive video game exposure and using it consistently with this dataset can reduce false positive results. Careful use of theoretically relevant control variables was also lacking in many published studies, likewise potentially resulting in false positives. Lastly, none of the previous studies were preregistered. Thus, there is value in conducting a reexamination of this otherwise fine dataset using a preregistered set of analyses and standardized assessment of key variables, to examine the validity of prior conclusions.

The Current Study

The present study reassesses links between aggressive video games and aggression in a large sample of youth from Singapore. These analyses test the straightforward hypotheses that aggressive video games are related to increased aggression and decreased prosocial behaviors. Seven outcome variables were preregistered: Prosocial Behavior, Physically Aggressive Behavior, Socially Aggressive Behavior, Aggressive Fantasies, Cyberbullying Perpetration, Trait Anger and Trait Forgiveness.

This analysis took several approaches to reducing Type I error. First, the analysis was preregistered (the preregistration can be found at https://osf.io/2dwmr). It is certified that the authors preregistered these methods and analyses before conducting any analyses with the dataset. Second, standardized assessments are used for all variables. The aggressive video games variable is calculated in a way typical of most aggressive video game studies and is detailed specifically below. Any further analyses or studies using this dataset should use this standardized approach and not vary from it. All other measures used full scale scores unless detailed otherwise. Third, theoretically relevant control variables were preregistered and employed. Lastly, all relevant outcome variables related to aggression and prosocial behavior are reported in this article. All outcome variables were preregistered prior to any analyses, and no analyses were included or excluded based on outcome, statistical significance, etc. The current article uses the 21-word statement suggested by Simmons et al. (2012, p. 4): “We report how we determined our sample size, all data exclusions (if any), all manipulations, and all measures in the study”.

As noted, effect sizes have often been very small in aggressive video game research, and their meaningfulness is debated. One way to examine the meaningfulness of effect sizes is to compare hypothesized effect sizes to nonsense effect sizes, that is, effect sizes for variables not thought to be practically related to aggressive video games. If nonsense outcomes and aggression/prosocial outcomes are of similar effect size magnitude, this is a further argument that such effect sizes should not be interpreted as meaningful, even if statistically significant. This approach was pioneered by Orben and Przybylski (2019b) in relation to screen time. Further, as recommended by Orben and Przybylski (2019a), an effect size cut-off of r= 0.10 will be employed as the threshold for minimal effects of interpretive value.

Methods

Participants

Participants in the current study were 3034 youth from Singapore. Of the sample, 72.8% reported being male. Mean age at Time 1 (T1) was 11.21 (SD= 2.06), and mean age at Time 3 (T3) was 13.12 (SD= 2.13). The majority of the sample were ethnic Chinese (72.6%), with smaller numbers of Malay (14.2%), Indian (8.7%) and other ethnicities, consistent with the ethnic composition of Singapore. As indicated above, participants were surveyed three times at 1-year intervals.

Materials

All measures discussed below were Likert-scale unless detailed otherwise. Also, full scale scores were averaged across individual items unless otherwise indicated for each measure. All control or predictor variables were assessed at T1 unless otherwise noted, whereas all outcome variables were assessed at T3 unless otherwise noted.

Aggressive video games (AVGs, main predictor)

Assessment of video game exposure can be difficult to do reliably and, as noted above, one concern with past use of this dataset is that the assessment of aggressive video games in past studies demonstrated the potential for questionable researcher practices (Przybylski and Weinstein 2019). The current study adopted a standard approach to assessing aggressive video game exposure (Olson et al. 2007). Participants were asked to list three video games they currently played and to report how often they played each on weekdays and weekends. The researchers obtained ESRB (Entertainment Software Rating Board) ratings for each of the games, which have been found to be a reliable and valid estimate of violent content (Ferguson 2011). For each game, the ordinal value of the ESRB rating (1 = ‘EC’ through 5 = ‘M’) was multiplied by average daily hours played. An average of these composite scores across the three games was then computed.
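As a concrete illustration of this computation, a minimal Python sketch is given below. The column names and the 5/7–2/7 weekday/weekend weighting used to form average daily hours are assumptions introduced here for illustration only; they are not taken from the dataset’s actual coding.

import pandas as pd

# Hypothetical column layout: for each of the three named games, an ordinal
# ESRB code (1 = EC ... 5 = M) plus hours played per weekday and per weekend day.
def avg_exposure(row):
    composites = []
    for g in (1, 2, 3):
        # Assumed weighting to form average daily hours (5 weekdays, 2 weekend days).
        daily_hours = (5 * row[f"game{g}_weekday_hrs"]
                       + 2 * row[f"game{g}_weekend_hrs"]) / 7
        composites.append(row[f"game{g}_esrb"] * daily_hours)
    return sum(composites) / len(composites)  # mean composite across the three games

example = pd.DataFrame({
    "game1_esrb": [5], "game1_weekday_hrs": [1.0], "game1_weekend_hrs": [2.0],
    "game2_esrb": [4], "game2_weekday_hrs": [0.5], "game2_weekend_hrs": [1.0],
    "game3_esrb": [2], "game3_weekday_hrs": [0.5], "game3_weekend_hrs": [0.5],
})
example["avg_exposure"] = example.apply(avg_exposure, axis=1)
print(example["avg_exposure"])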

It is noted that this method for computing the scores was preregistered before any data analysis and was not changed from the preregistration. Second, it is certified that any future articles using the aggressive video game variable will maintain these calculated scores. Lastly, it is advised that other authors using this dataset adhere to this standardized method of computing aggressive video game exposure, for consistency and to avoid questionable researcher practices. Though no claim is made that this is the best possible scale, using it consistently across papers can reduce Type I error due to methodological flexibility and make comparisons across papers more consistent.

Demographics (control variables)

Sex, age at T1 and mother’s reported years of education were used as basic control variables.

T1 aggressiveness (control variables)

In longitudinal analyses it is important to control for the T1 outcome variable in order to limit potential selection effects. In this case, the main outcome variables related to aggressive behavior were not assessed at T1, so to employ a consistent set of T1 selection controls, two T1 variables related to aggressiveness were used. The first was the Normative Beliefs in Aggression Scale (NOBAGS; Huesmann and Guerra 1997), a 20-item scale (alpha = 0.935) that asks youth whether the use of aggression is acceptable in varying circumstances. The second was a measure of hostile attribution bias (Crick 1995), which presented youth with six ambiguous scenarios and asked them to rate the aggressive intent of the characters in each scenario (alpha = 0.643). Taken together, these two measures appear to function adequately as an assessment of aggressiveness at T1.

T1 self-control (control variables)

Given evidence that self-control is associated with aggressive behavior (Schwartz et al. 2017), two measures of initial self-control were included as controls. These were a 6-item measure of self-control (alpha = 0.620), comprising items related to handling stress and losing one’s temper, and a 14-item measure of impulse control problems, which assessed inattentiveness, impulsive behaviors and excitability (Liau et al. 2011).

T1 intelligence (control variable)

Raven’s Progressive Matrices were used to assess non-verbal intelligence in the youth at T1. The Raven’s has generally been found to be a reliable and valid measure of intelligence across cultures (e.g. Shamama-tus-Sabah et al. 2012), although comparisons between cultures may not be advised. Given that intelligence is an important factor in serious aggression (Hampton et al. 2014), it was considered important to control for. Full scale scores were used.

Family environment (control variable)

Given evidence that family environment can influence aggression (DeCamp 2015), a six-item measure of family environment was included (alpha = 0.772; Glezer 1984). Items asked whether youth felt it was pleasant living at home, whether they felt accepted and whether there were too many arguments.

Prosocial behavior and empathy (T3 outcome, T1 control)

Prosocial behavior and empathy were assessed using the helping and cooperation subscales (18 items, alpha = 0.827 at T1, 0.834 at T3) of the Prosocial Orientation Questionnaire (Cheung et al. 1998). Items asked about willingness to help or volunteer, such as “I would help my friends when they have a problem.” This variable was assessed as a T3 outcome; for that analysis only, the T1 score was included as an additional control variable.

Aggressive behavior (outcome)

Aggressive behavior was assessed using a measure that included both physical (6 items, alpha = 0.869) and relational (6 items, alpha = 0.796) aggression (Linder et al. 2002; Morales and Crick 1998). Physical aggression items asked about assaultive behaviors, such as “When someone makes me really angry, I push or shove the person”, whereas relational aggression was more social than physical in nature, e.g. “When I am not invited to do something with a group of people, I will exclude those people from future activities.” These were assessed as separate outcome measures.

Aggressive fantasies (outcome)

Aggressive fantasies were measured using a 6-item scale (alpha = 0.839) that assessed whether youth harbored fantasies about harming others (Nadel et al. 1996). An example item is “Do you sometimes imagine or have daydreams about hitting or hurting somebody that you don’t like?”

Cyberbullying (outcome)

Cyberbullying perpetration was assessed using six items related to whether youth had been rude to, spread rumors about or threatened others on the internet (alpha = 0.888; Barlett and Gentile 2012).

Trait anger (outcome)

To assess trait anger, a 6-item scale was employed (alpha = 0.823; Buss and Perry 1992) to assess the degree to which youth felt ongoing anger or reacted badly when angered. A sample item is “I have trouble controlling my temper.” A seventh item (#4) was found to have poor reliability with the other items and was not included in the averaged scale score. This decision was made prior to any data analysis.

Trait forgiveness (outcome)

Trait forgiveness was assessed with a 10-item scale (alpha = 0.668; Berry et al. 2005), which asked about willingness to be merciful or forgiving of others who had done the youth harm. A sample item is “I try to forgive others even when they don’t feel guilty for what they did.”

Nonsense outcomes

Several nonsense outcomes were chosen for the lack of any theoretical link between them and aggressive video game exposure. These included T3 height, T2 myopia (the only variable taken from T2, as it was not available at T3), the age at which the youth moved to Singapore (if they were not born there) and whether the youth’s father was born in Singapore. Two scale scores were also included: a 17-item scale of T3 social phobia (alpha = 0.920) and a 10-item scale of T3 somatic complaints such as back pain, headaches, etc. (alpha = 0.878). A PsycINFO subject search for “violent video games” and “social phobia” returned 0 hits, as did a similar search using the term “somatic”. Therefore, it appears reasonable that these two scale scores are suitable nonsense outcomes with little theoretical link to aggressive video games.

Procedures

Participants in the study were 3034 students from 6 primary schools and 6 secondary schools in Singapore. The longitudinal aspect of the study involved following this cohort over a three-year period. The second wave of the longitudinal survey was conducted a year after the first wave, with procedures similar to Wave 1. The third wave was conducted a year after the second.

Four sets of counterbalanced questionnaires (i.e. presented in differing orders to reduce ordering effects) were delivered to all the schools. Letters of parental consent were sent to the parents through the schools. A liaison teacher from each school collated the information and excluded students from the study whose parents refused consent. The questionnaires were administered in the classrooms with the help of schoolteachers at the convenience of the schools. Detailed instructions were given to the schoolteachers who helped in the administration of the survey.

Students were told that participation in the survey was voluntary and that they could withdraw at any time. Privacy of the students’ responses was assured by requiring the teachers to seal the collected questionnaires in the envelopes provided, in the presence of the students. It was also highlighted on the questionnaires that the students’ responses would be read only by the researchers.

In the second and third years of the project, students who were followed up were no longer in classes together with their previous cohorts but were distributed across different classes together with other students who did not participate in the project.

All schools involved preferred to administer the questionnaires by classes rather than have the selected students taken out of their classes for the study. As a result of this administrative convenience, students not involved in the project were also surveyed.

All analyses were preregistered. Control variables were consistent across analyses, with the exception of including T1 prosocial/empathy when assessing T3 prosocial/empathy. All regressions used OLS with pairwise deletion for missing data. Analyses of variance inflation factors (VIFs) revealed no collinearity issues for any analysis, with no VIF reaching 2.0.
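A minimal sketch of one such regression and its VIF check is shown below, using simulated stand-in data. The variable names are hypothetical, and statsmodels drops incomplete rows listwise rather than pairwise, so this illustrates the model structure rather than reproducing the reported estimates.

import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Hypothetical variable names standing in for the study's predictor and controls.
predictors = ["avg_exposure", "female", "age_t1", "mother_edu",
              "nobags_t1", "hostile_attrib_t1", "self_control_t1",
              "impulse_problems_t1", "ravens_t1", "family_env_t1"]

# Simulated stand-in data so the sketch is runnable; the real dataset differs.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(500, len(predictors) + 1)),
                  columns=predictors + ["phys_aggression_t3"])

# Standardize so the coefficients are comparable to reported betas.
z = (df - df.mean()) / df.std()

X = sm.add_constant(z[predictors])
model = sm.OLS(z["phys_aggression_t3"], X, missing="drop").fit()  # listwise here
print(model.params)

# Collinearity check: the article reports no VIF reaching 2.0.
for i, name in enumerate(X.columns):
    if name != "const":
        print(name, round(variance_inflation_factor(X.values, i), 2))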

Results

A correlation matrix of the variables is presented in Table 1. Note that all regression models were significant at p< 0.001, including for nonsense outcomes, except for father’s birthplace, which was significant at p= 0.003.

Table 1 Correlation matrix

Main Study Hypotheses

Standardized regression coefficients for all main study outcomes are presented in Table 2. Aggressive video game exposure was not related to any of the aggression or prosocial outcomes. Although no single predictor was significant across all outcomes, the most consistent predictors were female sex (as a protective factor), positive family environment (as a protective factor) and initial problems with impulse control (as a risk factor). Prosocial behavior was also largely consistent across time.

Table 2 Main hypotheses regression outcomes at T3

Results for nonsense outcomes are presented in Table 3. Surprisingly, exposure to aggressive video games was a significant predictor of an earlier age of moving to Singapore. As there is no theoretical reason for such a relationship, this highlights how statistically significant outcomes, even with non-trivial effect sizes, can sometimes emerge and may be overinterpreted by scholars favoring their hypotheses.

Table 3 Nonsense variable regression outcomes

The mean of the absolute values of the effect sizes for aggressive video game exposure on hypothesized outcomes was r= 0.032. The mean of the absolute values of the effect sizes for nonsense variables was actually higher, at r= 0.039. If the largest value for the nonsense outcomes is removed, the mean effect size for the nonsense variables reduces to r= 0.022; however, eliminating the largest value from the hypothesized outcomes likewise reduces that mean effect size to r= 0.023. Thus, the effect sizes for the hypothesized effects and the nonsense effects appear approximately equivalent.
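This comparison reduces to simple arithmetic on the absolute values of the coefficients in Tables 2 and 3; a minimal sketch follows, with placeholder values rather than the actual table entries.

import numpy as np

def mean_abs_r(effects, drop_largest=False):
    """Mean of the absolute effect sizes, optionally dropping the largest |r|."""
    r = np.abs(np.asarray(effects, dtype=float))
    if drop_largest:
        r = np.delete(r, r.argmax())
    return float(r.mean())

# Placeholder values only; the actual coefficients appear in Tables 2 and 3.
hypothesized = [0.02, -0.03, 0.05, -0.01, 0.04, 0.03, -0.04]
nonsense = [0.01, -0.02, 0.12, 0.03, -0.02, 0.04]

print(mean_abs_r(hypothesized), mean_abs_r(nonsense))
print(mean_abs_r(hypothesized, drop_largest=True), mean_abs_r(nonsense, drop_largest=True))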

Exploratory Analysis not in Preregistration

To examine whether the handling of missing data influenced the results, all regressions were rerun with listwise rather than pairwise deletion. Results did not substantially change, suggesting that the findings are not an artifact of the missing-data approach. Effect sizes for some outcomes (such as cyberbullying) were slightly smaller under listwise deletion, but the pairwise deletion results are shown in the table, consistent with the preregistration.

Another means of considering the practical value of a predictor is to examine how much of that predictor would be required to achieve a clinically observable effect in real life. Orben and Przybylski (2019b) pioneered this approach using screen time and mental health outcomes. In clinical work, a clinically significant outcome is typically defined as approximately 1 SD above the mean (more generously for the hypothesis, a 0.5 SD threshold could also be applied). Unstandardized regression coefficients can then be used to calculate how much of the predictor variable is required to push the outcome variable to observable clinical significance.

This is only possible if the predictor variable itself exists in an observable metric such as time. Thus, Orben and Przybylski were able to calculate how many hours per day of screen time would be required to create a clinically observable effect on mental health in youth. However, aggressive video game exposure, as a combined measure of time and violent content, does not work effectively in this sense. Thus, a new variable was created using only M-rated games (the highest rating for commercially sold games), calculating the time spent playing M-rated games specifically. This allowed a mean hours/day figure for such games to be calculated. Physical aggression was used as the main outcome, as this was likely the outcome of greatest interest. For this variable the mean value was 1.524, on a range of 1 through 4 (SD= 0.593). Thus, a 1 SD increase would be 2.117, whereas a 0.5 SD increase would be 1.821.

The regression for the physical aggression outcome was then rerun, replacing aggressive video game exposure with time spent (average hours/day) on M-rated video games. As with the preregistered regression, the result was non-significant for M-rated game use (β = 0.022). However, if non-significance is ignored and it is assumed that this effect size might nonetheless be meaningful, the unstandardized regression coefficient (b = 0.022, SE = 0.023) can be used to calculate clinical significance. A daily hour spent on M-rated video games would thus correspond to an increase of 0.022 in the measure of physical aggression. By this metric it would take 27 h/day of M-rated video game play to raise aggression to a clinically observable level, assuming effects were causal (13.5 h/day for half a standard deviation).
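As a worked check of this arithmetic, using only the values reported above:

# Values reported above for physical aggression at T3 and the M-rated hours slope.
mean_pa, sd_pa = 1.524, 0.593
b = 0.022                          # unstandardized increase per daily hour of M-rated play

hours_for_1sd = sd_pa / b          # ~26.95, i.e. about 27 h/day for a 1 SD shift
hours_for_half_sd = 0.5 * sd_pa / b  # ~13.5 h/day for a 0.5 SD shift
print(round(hours_for_1sd, 1), round(hours_for_half_sd, 1))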

Discussion

Controversy continues regarding whether aggressive video games contribute to aggression in real life. Neither individual longitudinal studies nor meta-analyses have come to a clear conclusion regarding whether real-life effects exist. In some cases, undue flexibility in analytic methods may have created false positive results (Przybylski and Weinstein 2019). To assess this, the current article examined data from a large longitudinal study of youth in Singapore using preregistration and standardized measures. The current results found that aggressive video game exposure was not linked to either aggressive behavior or prosocial behavior two years later among youth. Regarding clinical significance, the results suggest that it would require more hours of M-rated game play to produce clinically significant aggression than exist in a day. Therefore, data from this study do not suggest that aggressive video games contribute to real-world aggression.

These results fit with numerous other recent longitudinal analyses (e.g. Breuer et al. 2015; Lobel et al. 2017; von Salisch et al. 2011) that have found no long-term predictive relationship between aggressive video games and future aggression in youth. To the extent that youth aggression is multi-determined, aggressive video game exposure does not appear to be one of the risk factors for such outcomes. Quotes such as “Violent video games are just one risk factor. They’re not the biggest, and they’re not the smallest. They’re right in the middle, with kind of the same effect size as coming from a broken home” (Gentile, quoted in Almendrala 2014) appear to be entirely incorrect. Aggressive video game playing does not appear to be a risk factor for future youth aggression at all and certainly should not be compared to the influence of broken homes. It is argued that researchers need to be far more cautious in communicating longitudinal effects for aggressive video games to the general public. Overall, the evidence does not appear to support such a link. The current study not only adds to this evidence but reanalyzes evidence that has sometimes been used to support such claims. With preregistration and proper controls, it is clear that the Singapore dataset should not be considered evidentiary support for long-term aggressive video game influences on youth. Given that few longitudinal studies provide effect sizes above r= 0.10 for any form of deleterious effect, claims of long-term harms from aggressive video game exposure have simply not been substantiated.

The current analyses have several implications. The first concerns meta-analyses. Most meta-analyses compile effect sizes from published articles under the assumption that the reported effect sizes are reasonably accurate and representative of population effect sizes. However, as indicated above, flexibility in methods and unstandardized assessments may cause spuriously high effect size estimates (Przybylski and Weinstein 2019), producing errors in meta-analysis. Recent preregistered studies of aggressive video game effects, of which there are perhaps half a dozen, have generally not found evidence for negative effects (e.g. McCarthy et al. 2016; Przybylski and Weinstein 2019; although see Ivory et al. 2017 for one high-quality exception). Thus, most extant meta-analyses may be compounding the issue of spurious effects reported in individual studies.

The second issue concerns the interpretation of potentially trivial effects. In many studies, including this one, the reported effect sizes are below r= 0.10. Nonetheless, with large sample sizes, these may become statistically significant. The current analysis suggests that relying on statistical significance is likely to cause spurious interpretation of trivial effects. In the current analysis, the effect sizes for aggressive video game exposure predicting nonsense outcomes were equivalent to those for predicting aggression or prosocial outcomes. Similar results have been found in other studies that have examined this issue (e.g. Orben and Przybylski 2019b). These findings support the concern that the risk of Type I error in large samples with small effect sizes is intolerably high, often resulting in misinterpretation of findings that do not, in fact, provide evidence for study hypotheses. Given that many such outcomes will have p-values much lower than .05, traditional publication bias tests may have difficulty detecting spurious outcomes, even when they are the result of questionable researcher practices, as has been noted for previous articles using this dataset (Przybylski and Weinstein 2019). Thus, the current article supports Orben and Przybylski (2019a) in recommending against interpreting effect sizes below r= 0.10, at least in this domain.

It is worth noting some of the predictors that were significant. Both female gender and positive family environment were protective factors, whereas impulse control problems were a risk factor for negative outcomes. Thus, public policies aimed at strengthening families and increasing youth impulse control are likely to be more productive than those that target video games.

Developmental Implications

Much of the scholarship of the previous few decades has evolved with a tacit understanding that children act as passive imitators, with little distinction in their modeling between real-life and fictional events. This has led to sometimes sweeping conclusions about the harmfulness of a variety of media experiences, not limited to violent content. Perhaps most notable in relation to video games was the APA’s recent (2015) resolution connecting aggressive video games to aggression in real life (though not to violent crime).

Increasingly, however, research, particularly that which is preregistered and uses standardized measures, has had difficulty finding evidence that exposure to fictional media, and to aggressive video games specifically, is connected to the development of more aggressive profiles among youth. These newer results suggest that youth media experiences may be more nuanced and complex than a simple connection between “naughty” media and negative outcomes. The current study joins this expanding pool of research in suggesting that resolutions such as that by the APA are not consistent with the cumulative pool of preregistered studies using standardized measures (e.g. Przybylski and Weinstein 2019). Put simply, the APA resolution on aggressive video games does not reflect current best science.

This has important implications for policy insofar as policies aimed at reducing youth exposure to aggressive video games are unlikely to result in positive developmental outcomes. However, such policies may come with significant costs, including restrictions on freedom of speech, limits on youth creative experiences, stigmatization of the use of games in education, and stigmatization of gaming as a hobby and of gamers as a community. With little evidence to suggest that policies geared toward reducing aggressive video game exposure would have positive practical outcomes, such policy efforts are not recommended.

Limitations

As with all studies, ours has limitations. All measures were youth self-report. Self-report measures are not always fully reliable and can be subject to single-responder bias; further studies using multiple responders would be desirable. Data in the current study are correlational, and no causal attributions can be made. Lastly, constructing a valid self-report measure of aggressive video game exposure can be difficult. Here the current study used a standardized and replicable approach, which is an improvement on some previous approaches. However, quantifying aggressive video game exposure using time spent on multiple games may still introduce some measurement error.

Conclusion

The issue of the impact of aggressive video games on youth aggression continues to be debated. There appears to be some confusion among scholars (e.g. Prescott et al. 2018) regarding whether current evidence supports long-term links between aggressive video games and youth aggression, despite most longitudinal studies failing to demonstrate robust results. The current article presents a preregistered, standardized assessment of aggressive video game effects using a large sample of Singaporean youth. Results indicate that, using a standardized measurement approach that was preregistered, this dataset does not support the hypothesis that aggressive video games are a risk factor for aggression in youth. Given previous issues with researcher degrees of freedom in earlier reports (see Przybylski and Weinstein 2019 for discussion), it is recommended that the effect sizes reported here be used to represent this dataset. The current analyses contribute to a growing number of studies that call into question whether aggressive video games function as a meaningful predictor of aggressive or prosocial behavior. It is hoped that these data further the ongoing debate on this issue.