1 Introduction

Along with incentive schemes, another well-established way to align the interests of principals and agents and, consequently, to reduce and ultimately eliminate biases and errors is the practice of monitoring. Further, modern technological advances have extended organizations’ monitoring possibilities, and monitoring is becoming increasingly widespread (Alder 2001; Fusi and Feeney 2018). Tools are tailored to specific tasks and include video cameras, website tracking, badges with electronic transmitters, global positioning satellite technologies and computer keystroke monitoring (Tabak and Smith 2005; Chen and Ross 2007).

While the theoretical mechanism of monitoring is well understood, its effectiveness and implications are typically controversially debated (Kelly 2001). According to standard neoclassical principal-agent theory, the additional control over agents provided by monitoring should prevent agents from shirking and reinforce behaviour that is in line with the principal’s preferences. Yet, the literature indicates that monitoring may not always have this intended effect (e.g., Falk and Kosfeld 2006); instead, monitoring might generate mistrust among agents and, thus, dampen and crowd out the actual effect that principals desire (Ellingsen and Johannesson 2008; Frey 1993; Nagin et al. 2002; Falk and Kosfeld 2006). It is likely that excessive surveillance restrains performance (Ball 2010) and that monitoring offers diminishing returns instead of adding benefits (Bradbury 2019a).

Existing empirical evidence mainly stems from laboratory experiments (e.g., Falk and Kosfeld 2006; Dickinson and Villeval 2008) or call centres (e.g., Nagin et al. 2002; Jeske and Santuzzi 2015). While these studies are valuable, their findings may not be generally applicable to all workplaces. Similar to Mills (2017) and Bradbury (2019b), the current study uses sports as a testing ground and specifically analyses elite referees and their response to a new monitoring system. Sports can provide an ideal setting for the analysis of human behaviour (Goff and Tollison 1990), thereby allowing conclusions to be drawn for other fields.

Using data from football, in this study, we examine professional football referees’ decisions in response to the introduction of a new control system. Not only in sports, but in a variety of contexts, economic and social actors rely on professional arbitrators to apply the rules impartially and without favouritism (Bryson et al. 2021). Despite high (external and internal) incentives, experience and intense training, however, such impartiality may be subject to external circumstances and social forces that lead to (unconsciously) biased and ultimately incorrect decisions. Even though extensive knowledge is needed to control such experts’ behaviour, doing so might be important for a variety of organizations (Bradbury 2019b).

In elite football, where the referee represents the impartial agent (Sutter and Kocher 2004) and the league- or country-specific organisational committee/association [e.g., Deutscher Fussball Bund e.V. (DFB), Federazione Italiana Giuoco Calcio (FIGC)] acts as the principal, one such well-documented bias is the home advantage that occurs as a result of referees’ decisions. Various works in the literature have concluded that referees are likely to be influenced by external factors such as crowd noise (Nevill et al. 2002) or social pressure (Pettersson-Lidbom and Priks 2010), leading them to give preferential treatment to the home team, manifesting particularly in stoppage time decisions at the end of a match (e.g., Sutter and Kocher 2004; Garicano et al. 2005; Dohmen 2008), in decisions to award penalty kicks (e.g., Sutter and Kocher 2004) and in the issuing of disciplinary sanctions (e.g., Dawson and Dobson 2010; Buraimo et al. 2010). Such favouritism and potentially incorrect decisions may not only have an enormous influence on a game’s final outcome, but they may also have serious economic impacts, such as financial losses if disadvantaged teams end up in lower positions in the league or with less prize money. Even though this bias might cancel out during a double round-robin competition, where each team plays the exact same number of matches at home and away against the same teams, Peeters and Van Ours (2021) conclude that the home advantage of some teams is permanently stronger compared to others and, thus, poses a relevant problem for the fairness of the competition. Moreover, Page and Page (2007), Boyko et al. (2007), and Dawson et al. (2007) show that different referees provide different levels of home advantage and that some referees are more susceptible to being influenced by a crowd than others. 
Considering that the factors that contribute to a higher relative home advantage could be systematically exploited, league administrators should limit the possible sources for such biases, where one possible solution is improving the referees’ consistency.

To contribute to greater fairness and also to compensate for and acknowledge referees’ limits of perception, several (technological) aids have progressively been introduced into elite football, VAR being the most recent (Oudejans et al. 2000; Kolbinger and Link 2016; Kolbinger and Lames 2017). VAR was introduced into the Laws of the Game in 2018 to support match officials by reviewing their decisions and monitoring how well they follow the rules by means of screening video footage for three game situations (goals, red card incidents and penalties) and one administrative incident (mistaken identity). In these cases, VAR intervenes after a referee makes a clear and manifest error of assessment in order to increase (objective) fairness and to assist in correct decision-making (Fédération Internationale de Football Association 2019; Lago-Peñas et al. 2019). In practice, once the VAR has reviewed the video material, the head referee is informed via headset about the content of the specific video sequence, and appropriate decisions or actions can then be taken.

The general impact of VAR on the game was analysed by Lago-Peñas et al. (2019). They conclude that the VAR system does not substantially modify the game in elite football but leads only to a decrease in the number of offside violations, fouls and yellow cards in the German Bundesliga and Italian Serie A, and an increase in the number of minutes added to the regular playing time at the end of the first half but not at the end of the second half. However, it remains unclear whether this technology increases the efficiency of referees’ decisions and has its intended effect on the fairness of the game. Taking on this gap, we investigate the VAR’s monitoring mechanism with respect to referees’ decisions and home bias. In particular, we study whether VAR has changed the decision-making behaviour of referees operating in the first Italian (Serie A) and German (Bundesliga) professional football leagues and, consequently, whether it has increased general fairness by decreasing referees’ home bias. Thereby, we implicitly examine whether the introduction of VAR has helped to overcome the inefficiencies of the referee selection system or whether this system was already relatively efficient before VAR.

To do so, we consider games with red cards and/or penalty kicks as well as further corresponding match-specific variables, such as added time, and we analyse how these areas of the game have changed with the introduction of VAR. Generally, if the referee selection and training process were inefficient, the introduction of VAR would lead to a decline in referees’ preferential treatment due to them being constantly monitored and, consequently, would result in a more game-oriented and fair distribution of penalty kicks, red cards and added time. (Even though added time is not directly checked by VAR, we would also expect that in the case of ex ante inefficiency, the introduction of VAR would impact referees’ decisions regarding allocated added time.) Additionally, we compare referees with different levels of experience and analyse whether there is a differentiation of VAR usage for referees with more or less experience. In the case of ex ante inefficiency, we would expect that VAR intervenes more often when the referee is new and, thus, inexperienced in the league.

The next section overviews the related concepts and studies, provides more details on the implementation and mechanism of VAR, and outlines the commonalities and special features of the top tier leagues in Germany and Italy. In Sect. 3, we describe our data and findings. Lastly, in Sect. 4, we conclude with a discussion and final remarks.

2 Background

2.1 Football referees and home advantage

Football referees serve as impartial guardians of the law and authority for the interpretation of the rules on the field. They can, however, exercise substantial discretionary power in the general subjectivity of their assessments, in particular regarding added time, the imposition of penalties, the awarding of yellow or red cards and decisions on free kicks or offside violations (Sutter and Kocher 2004). Consequently, referees have considerable influence on the final result of a match. Considering that the football business has developed rapidly and has become faster and more athletic (e.g., Matheson 2003; Frick 2007), the challenges referees face have increased, as has their overall responsibility and their physical and mental requirements. To become an official in elite football, referees have to go through a long training and reviewing process. The number of evaluations a referee receives increases continuously with higher league membership and is based on predefined criteria, such as the application of rules, but also more vague criteria such as the personality of the referee. Thus, the highest divisions only accept referees who have prevailed over their competitors in a multi-year promotion competition and have sustainably proven their professional, personal and pedagogical qualifications (Noller 2016). Also, a referee’s monetary remuneration reflects their success in outperforming the other competitors and the demanding conditions referees face.

Despite this extensive scope of training, the high experience levels and the great monetary incentives, referees’ decisions are generally not perfectly in line with the interests of the superordinate association (e.g., FIFA), the principal. The literature on endogenous preference formation acknowledges that the social environment can affect individual behaviour (e.g., Rabin 1998; Bar‐Gill and Fershtman 2005). In the context of elite football, one of the decisive mechanisms or subconscious triggers for biased referee decisions is the effect and social pressure of the home team crowd support (Sutter and Kocher 2004; Dawson et al. 2007; Moskowitz and Wertheim 2011); the crowd not only cheers and pushes home players to peak performance, as evidenced by players having significantly higher testosterone levels when playing at home (Neave and Wolfson 2003), but the noise created by the crowd can also impact referees’ decisions in favour of the home team. This so-called home bias, meaning that the home teams win more often than away teams (Sutter and Kocher 2004), has been extensively studied across different sports and various leagues (e.g., Sutter and Kocher 2004; Garicano et al. 2005; Dohmen 2008; Pettersson-Lidbom and Priks 2010; Buraimo et al. 2010, 2012). For instance, Pettersson-Lidbom and Priks (2010) investigated football matches in Italy, where the public was not allowed to attend as a consequence of hooligan violence. Their findings reveal that players of the away team are punished more severely when matches are played in front of spectators than when matches are played in empty stadiums. In another study on home bias, Buraimo et al. (2012) determined that home teams receive fewer red cards than away teams, and a study by Nevill et al. (2002) showed that referees tend to penalise the home team less often than the away team for the same foul. Sutter and Kocher (2004) also show that the away team is awarded a regular penalty less frequently than the home team. Based on data from twelve consecutive German Bundesliga seasons, Dohmen (2008) found that the home teams are systematically given an advantage in close matches due to social forces. The study showed that the awarded added time in the second half of a football match is substantially longer if the home team is one goal behind compared to situations in which the home team is leading by one goal. Hence, home teams are afforded more time to reverse an inferior position than away teams.

Altogether, these forms of malpractice and divergent interests by economic actors belong to the area of agency theory (Lucey and Power 2009); a possible explanation may be that when making decisions on the pitch, referees weigh not only material payoffs but also social payoffs (social approvals or sanctions) (Dohmen 2008). According to Nevill et al. (2002), referees seem to avoid displeasing the crowd; rather than penalising away teams, the dominant effect of crowd noise would appear to influence qualified referees to impose fewer disciplinary sanctions on home players. In line with this, Balmer et al. (2007) claim that crowd noise is associated with higher levels of anxiety and mental effort for referees; therefore, referees tend to cope with such situations with more popular decisions for the home team. In addition, in the case of uncertainty about a foul, referees may tend to give equal weight to their own visual information and the reactions of the crowd in order to reduce complexity, but this may cause referees to be misled (Sutter and Kocher 2004). This combination of own visual information and reaction of the crowd not only reveals the referee’s decision trade-off, but it also shows the diverging interests of principal and agent, which is further reinforced by the referee’s high amount of discretionary power. This trade-off is further emphasized by recent studies that found that during ghost games (i.e., games without spectators) referees treated home teams less favourably (e.g., Dilger and Vischer 2020; Endrich and Gesche 2020; Tilp and Thaller 2020; Bryson et al. 2021).

As ghost games are not a long-term solution, an adequate tool to mitigate or at least reduce these referee biases is additional reviewing and, thus, monitoring (Parsons et al. 2011; Pope et al. 2018). To balance football referees’ increasing duties, recognise the limits of their perception and ensure fair play, several supporting and monitoring instruments have been introduced in recent years. Examples are headsets as continuous communication tools, goal-line technology, vanishing spray (Kolbinger and Link 2016) and, most recently, VAR.

2.2 The implementation of VAR

After an extensive test phase starting in 2012/2013 in the Netherlands, VAR technology was implemented into the FIFA Laws of the Game in 2018 (Fédération Internationale de Football Association 2019). See Appendix A for an overview of the introduction of VAR across different countries and main tournaments.

Since the introduction, VARs have monitored the main referee’s application and execution of the rules, thereby aiming to increase (objective) fairness and assisting in decision-making with respect to the following four situations: goals, penalty kicks, (straight) red card incidents and administrative incidents (mistaken identity). During a game, assistants constantly screen video material and, if needed, inform the main referee about a doubtful situation. The match official can then either accept the information of VAR and make their decision accordingly or can review the video sequences on a specially installed screen on the side of the field. Still, the final decision is the head referee’s responsibility. Overall, VAR can be characterised as a technological innovation in controlling referees’ decision-making processes (i.e., monitoring) that combines a human component (i.e., the video referee) and a technological component (i.e., the video and audio system) (Cid and García 2020).

2.3 VAR as a monitoring tool

While critics of VAR claim it disturbs the flow of the game, it does provide support for the referee (Nlandu 2012; Svantesson 2014). In situations where intervention is allowed, VAR may be able to eliminate a referee’s biases or perceptual flaws. As head referees’ decisions are immediately validated by VAR, it can be expected that (i) with the introduction of VAR, the number of VAR-applicable decisions in favour of the home team will decrease (compared to before VAR implementation), and (ii) that the allocation of penalty kicks and red cards awarded to the home and visiting team as well as the allocation of added time, even if the added time is not directly checked by the VAR, will become fairer and less biased.

It should be noted, however, that both penalty kicks and red cards are rare and critical situations, but their complexity varies. Situations that result in a red card are usually clear by nature, as they require a player to commit a gross, unsportsmanlike foul against an opponent. In most red-card cases, only two players are involved, and they are in the referee’s direct field of vision. In turn, increased monitoring due to VAR is likely to have greater consequences for situations resulting in penalty kicks, which generally occur more frequently but are sparked by less obvious incidents. Penalty kicks often result from a complex, obscure game situation in which many players are involved in a dynamic environment. This includes, for example, penalties resulting from a hidden handball or an undetected foul play after a corner kick. Although this applies to situations that are crucial to the game and are monitored by the VAR, many minor decisions, such as usual fouls, yellow card incidents or the determination of the added time, are also at the discretion of the referee and can, consequently, still be part of a coping strategy to counteract crowd pressure and receive social payoffs in terms of social approval, which is contrary to the principal’s interest.

2.4 League-specific characteristics

Both German and Italian football associations introduced the VAR system to their highest leagues during the 2017/2018 season. The DFB, however, declared the first year as a 1-year test phase until it was officially introduced in the Bundesliga at the beginning of the 2018/2019 season (DFB 2019). According to a recently published interim report from the DFB, VAR was able to prevent 82 wrong decisions in the 2018/2019 season and 64 wrong decisions in the previous season in the Bundesliga (DFB 2019). Similar results were published by the FIGC and Lega Serie A (FIGC 2019). Altogether, both reports display a massive fall in the number of refereeing errors.

While in Germany and Italy, football is historically seen as the most popular sport in the country, there are a few structural differences between the two leagues, as summarised in Table 1. Differences are especially obvious with respect to the number of active referees per season and the average number of spectators in a stadium. While spectator numbers in the German Bundesliga are constantly high, the Italian Serie A has for many years been struggling with fewer spectators (SPOX 2016).

Table 1 Main facts about German Bundesliga and Italian Serie A

3 Data and results

3.1 Data and descriptives

We constructed a dataset including all games in the German Bundesliga and Italian Serie A with situations of given red cards and awarded penalty kicks two seasons before (2015/2016 and 2016/2017) and two seasons after (2017/2018 and 2018/2019) the introduction of VAR. We retrieved our main data from whoscored.com and added specific information on the matches from transfermarkt.de, dfb.de and weltfussball.de. In particular, we recorded whether the penalty decision was awarded to the home or the away team, the number of spectators, stadium capacity, added time, the referee’s experience, the recorded winner, and other match-related structural characteristics. In addition to the absolute number of spectators and stadium capacity, we calculated capacity utilisation per match in order to draw conclusions about possible crowd noise effects (Dawson 2014). Moreover, we collected information on referees and measured a referee’s experience as the number of top-tier matches officiated as of the start of the ongoing season rather than on a match-by-match basis. Furthermore, we added information on match-specific average odds from wettportal.com as a proxy for the current relative playing strength of the respective team (Dawson 2014). The difference in odds between home and away team signals the predicted intensity of competition in the match. The smaller the difference, the more competitive the match is expected to be. Lastly, we validated all VAR decisions by examining video recordings via youtube.de of the respective situations. In total, our sample contains data on game-decisive situations with red cards and penalty kicks from 2,744 matches of the German Bundesliga and Italian Serie A.
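The two derived match-level variables described above can be illustrated with a minimal sketch. The `Match` record, its field names and the example values are hypothetical and serve only as illustration; taking the absolute value of the odds gap is likewise an assumption, since the text only states that a smaller difference signals a more competitive match.

```python
from dataclasses import dataclass


@dataclass
class Match:
    # Hypothetical record layout; field names are not taken from the study.
    spectators: int
    stadium_capacity: int
    home_odds: float
    away_odds: float


def capacity_utilisation(match: Match) -> float:
    """Share of the stadium that is filled (spectators / capacity)."""
    return match.spectators / match.stadium_capacity


def odds_difference(match: Match) -> float:
    """Gap between home and away odds; smaller means a closer match.

    Using the absolute value is an assumption made for this sketch.
    """
    return abs(match.home_odds - match.away_odds)


# Made-up example values, not drawn from the study's data.
example = Match(spectators=40_000, stadium_capacity=50_000,
                home_odds=2.0, away_odds=3.5)
print(capacity_utilisation(example))  # 0.8
print(odds_difference(example))       # 1.5
```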

Descriptive statistics are presented in Table 2. In line with the previously described league- and game-specific differences, the total number of red cards and penalty kicks in the Italian Serie A was higher than in the German Bundesliga. Regarding other factors such as added time, the referee’s experience, home team winners and betting odds difference, no substantial differences are observed. Additionally, we observe a generally higher number of critical VAR incidents in the Italian Serie A compared to the German Bundesliga, which could be a potential consequence of different rule interpretation or a generally more rugged way of playing.

Table 2 Descriptives of our sample

In Figs. 1 and 2, we illustrate the total number of penalty kicks and red cards for the German Bundesliga and Italian Serie A over time. The implementation of VAR is marked in red, and we differentiate between incidents with and without VAR in 17/18 and 18/19. The absolute numbers of both penalty kicks and red cards were practically constant over time (2012/2013–2018/2019), with a mean of 87 penalty kicks per season for the German Bundesliga and 124 for the Italian Serie A. The same applies to red cards, with on average 51 red cards per season for the German Bundesliga and 107 red cards per season for the Italian Serie A. After the introduction of VAR, the total number of initially given penalty kicks decreased by more than 25% and the number of given red cards decreased by more than 30%. The graphs indicate that VAR influences the decision-making behaviour of referees: they award fewer penalty kicks and red cards directly, but the VAR system, as the newly introduced monitoring tool, intervenes such that the total number of penalty kicks and red cards remains constant over time.
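As a quick consistency check, the consolidated per-season averages given in the captions of Figs. 1 and 2 should equal the sum of the two league means reported above. The short sketch below uses only the figures stated in the text; the variable names are illustrative.

```python
# League means per season as reported in the text (2012/2013-2018/2019).
penalty_means = {"Bundesliga": 87, "Serie A": 124}
red_card_means = {"Bundesliga": 51, "Serie A": 107}

# Consolidated averages across both leagues.
consolidated_penalties = sum(penalty_means.values())
consolidated_red_cards = sum(red_card_means.values())

print(consolidated_penalties)  # 211
print(consolidated_red_cards)  # 158
```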

Fig. 1

Total number of penalty kicks over time (season 12/13–18/19). The figure shows the consolidated figures for penalty kicks and red cards in the German Bundesliga and Italian Serie A per season. The average is 211 penalty kicks per season

Fig. 2

Total number of red cards over time (season 12/13–18/19). The figure shows the consolidated figures for penalty kicks and red cards in the German Bundesliga and Italian Serie A per season. The average is 158 red cards per season

3.2 Home bias in awarding added time (before and after VAR introduction)

To gain further insights into these dynamics and study the impact of increased monitoring on home bias, the empirical analysis started with an investigation of added time decisions. We considered games with red cards or penalty kicks before and after the introduction of VAR and expected that referees, as a result of social payoffs like social rewards from the home crowd, would award more added time when the home team is behind (Dohmen 2008). However, as a result of increased monitoring via VAR, although added time is not directly checked by the VAR, we expected this bias to be smaller or even eliminated after VAR was introduced.

Figure 3 reports the average added time (in seconds) in the second half, conditional on the score difference of all our considered matches. The inverted U-shape of this relationship is in accordance with the findings of Sutter and Kocher (2004), Dohmen (2008) as well as Rocha et al. (2013) and indicates that referees systematically reduce added time when score differences are large. The greatest amount of added time was awarded in close matches when the score difference was 1. In Fig. 4, we plotted this relationship separately for two seasons before and two seasons after the introduction of the VAR system. The simple comparison of the two charts demonstrates an increase in the average added time and underlines our suggestion that the VAR system changed the dynamics of the game.

Fig. 3

Added time and score margins in the German Bundesliga and Italian Serie A (2015/16–2018/2019)

Fig. 4

Added time and score margins by VAR introduction. 0 denotes seasons 2015/2016 and 2016/2017, 1 denotes seasons 2017/2018 and 2018/2019

For more detailed insight, we concentrated on close games (games with a score margin of one), in which a red card or penalty kick occurred before and after VAR introduction. Following the standards in the relevant literature, the so-called home bias appears when referees systematically reduce added time in tight matches when the home team has a small score advantage. Estimates from ordinary least squares regressions are reported in Table 3 separately for both Italian Serie A and German Bundesliga. The dependent variable in all models is the added time in seconds at the end of the second half. The home bias in awarded added time is measured by the coefficient score difference in favour of the home team.
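The logic of this regression can be sketched in a deliberately simplified form. The example below fits added time on a single dummy (1 if the home team leads by one goal, 0 if it trails by one goal) and omits all control variables, so it is not the specification reported in Table 3. The helper `ols_simple` and the six observations are hypothetical and chosen only for illustration. With a single binary regressor, the slope equals the difference in mean added time between the two groups, and a negative slope corresponds to the home bias defined above.

```python
from statistics import mean


def ols_simple(x, y):
    """Return (intercept, slope) of y = a + b*x fitted by least squares."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical close games: 1 = home team leads by one goal, 0 = trails by one.
home_ahead = [1, 1, 1, 0, 0, 0]
# Added time in seconds; made-up values, not taken from the study.
added_time = [180, 200, 190, 210, 220, 200]

intercept, slope = ols_simple(home_ahead, added_time)
print(intercept)  # 210.0 -> mean added time when the home team trails
print(slope)      # -20.0 -> less added time when the home team leads
```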

Table 3 Added time at the end of a match in close games with red cards or penalty kicks before and after the introduction of the VAR system

Columns 1–3 of Panel A of Table 3 show that before VAR introduction, referees in the Italian Serie A systematically reduced the added time when a home team had a small score advantage. Even though the coefficient of interest is only weakly significant, this means that when the home team was winning by one goal, the referee gave on average 8.6 s less added time than when the home team was losing by one goal. However, after the implementation of VAR, this systematic bias no longer exists (Columns 4–6). With respect to the German Bundesliga, we find no evidence of a home bias in awarded added time either before or after VAR introduction and, thus, cannot provide insights into systematic changes after the implementation of VAR. To shed further light on VAR interventions and home bias, we analysed red cards and penalty kicks separately before and after VAR introduction.

3.3 Home bias in awarding penalty kicks and red cards

Figure 5 separately shows the distribution of penalty kicks and red cards for away and home teams two seasons before and two seasons after the introduction of VAR. (See Appendices B and C for the specific numbers for the highest league in both countries.) We were specifically interested in the distribution patterns of red cards and penalty kicks between home and away teams and assumed that decisions which are given after or confirmed by a VAR intervention are free of human biases.

Fig. 5

VAR implementation and home bias for penalty kicks and red cards. A Penalty kicks. B Red cards. The abbreviation md. indicates the sum of matchdays in the German Bundesliga and the Italian Serie A. H denotes the home team being awarded a penalty kick or receiving a red card, and A denotes those for the away team

As illustrated in Panel A of Fig. 5, in all four considered seasons, home teams were awarded relatively more penalty kicks than away teams. The introduction of VAR did not change this distribution, and the VAR system intervened with comparable frequency for both the home and away teams. From this even distribution of VAR interventions and the fact that the home team was consistently awarded more penalty kicks, it seems plausible that the home team gains an advantage from playing in front of the home crowd, but not as a result of referee favouritism.

With respect to red card incidents, Panel B of Fig. 5 shows that almost 60% of total red cards per season were given to the away team. During the first season in which VAR was active, it intervened disproportionately often with decisions against the home team. This may point towards a home bias, as the referee’s initial decision was corrected advantageously for the away team. Thus, VAR seems to mitigate the effect of the home crowd’s social pressure on a referee’s decisions. One could interpret VAR as a mechanical, impartial device that absorbs the social pressure exerted by the crowd on the referee; by definition, this device cannot be personally blamed by any crowd of spectators.

However, this pattern could not be confirmed in the second season with VAR; instead, the total VAR interventions with red cards drastically decreased (from 31 interventions in 17/18 to 9 interventions in 18/19).
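The size of this drop can be made explicit with a short calculation based on the two intervention counts stated above (31 and 9); the helper name `relative_change` is illustrative.

```python
def relative_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, as a fraction of `before`."""
    return (after - before) / before


# Red-card VAR interventions: 31 in 17/18 and 9 in 18/19 (from the text).
change = relative_change(31, 9)
print(f"{change:.1%}")  # -71.0%
```

In other words, red-card interventions fell by roughly 71% from the first to the second VAR season.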

3.4 VAR intervention and referees’ experience

Lastly, we investigated whether a referee’s experience level was related to VAR usage. Results of a separate analysis for referees’ different experience levels are given in Table 4. High (low) experience is defined as a number of officiated national matches above (below) the sample mean. The results of a t-test suggest that there are no systematic differences regarding general VAR usage or VAR usage in the last quarter between experienced and inexperienced referees. Further, we observe no significant differences in awarded added time and no impact of a sold-out stadium.
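The comparison can be sketched as follows: referees are split at the sample mean of experience, and the two groups are compared with a two-sample t statistic. The sketch assumes Welch's unequal-variance form, which the text does not specify, and stops at the test statistic without computing a p-value. The function names and the six (experience, VAR interventions) pairs are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def split_by_mean_experience(referees):
    """Split (experience, outcome) pairs at the sample mean of experience."""
    cutoff = mean([exp for exp, _ in referees])
    high = [out for exp, out in referees if exp > cutoff]
    low = [out for exp, out in referees if exp < cutoff]
    return high, low


def welch_t(a, b):
    """Two-sample t statistic without assuming equal variances."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


# Made-up pairs: (top-tier matches officiated, VAR interventions).
referees = [(10, 4), (20, 6), (30, 6), (100, 5), (120, 4), (140, 6)]

high, low = split_by_mean_experience(referees)
t_stat = welch_t(high, low)
print(round(t_stat, 3))  # -0.378
```

A t statistic this close to zero would be consistent with no systematic difference between the two groups, which is the pattern reported in the text.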

Table 4 Game characteristics by referee’s experience

4 Discussion and conclusion

Using football data, we analyse the impact of a monitoring system for experts. In the context of football, the introduction of the VAR system, a tool that provides transparency and makes referees aware of critical situations, provides an ideal setting for investigating real-time monitoring in an already highly optimised principal-agent framework, thereby offering insights into this new technology.

Overall, our analysis of games with red cards or penalty kicks during four seasons showed that (i) referees changed their behaviour following the introduction of VAR; (ii) in the case of the Italian Serie A, referees likely showed preferential treatment of the home team, which manifested in added time decisions before the introduction of VAR, but this bias was not present in the two seasons after the introduction of the new system; (iii) home teams were awarded more penalty kicks and fewer red cards, but, concerning penalties, VAR intervened equally often for both the home and away teams; (iv) no differences were found when comparing the frequency of VAR interventions between experienced and inexperienced referees. With the introduction of VAR, the number of directly called penalty kicks and awarded red cards decreased, but the total numbers of penalty kicks and red cards over a season stayed about the same; this is a first indication that the new system is effective. The fact that referees initially called fewer penalty kicks and awarded fewer red cards after VAR introduction may indicate that referees were somewhat uncertain of their decisions, and they wanted to benefit from the new system rather than be corrected afterward. The transparency achieved by VAR makes agent performance even more public in real time. Therefore, the risk that is transferred to the agent increases, in that monitoring and, hence, public scrutiny may affect their reputation (negatively or positively). While fewer mistakes may lead to a better reputation for the referee, too many corrections by VAR might have the opposite effect. However, in the highest German and Italian leagues, the VAR system only intervened in about every second game, making reputational losses rather unlikely.

By focusing on the referee’s response and evaluating decisions where VAR could have intervened, we eliminated potential favouritism or bias induced by external factors on the head referee. In line with existing literature, the total distribution of penalty kicks and red cards over the whole period of consideration pointed towards an existing home bias (e.g., Sutter and Kocher 2004). However, the examination of penalty kicks showed that the VAR system intervened to a similar extent for both home and away teams, implying that a home bias exists for reasons other than the referee (e.g., playing in familiar surroundings). We suggest that this comparatively higher number of penalty kicks for the home team results from the generally higher game intensity and dominance of this team. It seems that playing in the home stadium in front of local fans results in a highly offence-oriented and dominant playing strategy and, therefore, creates more critical situations in the away team’s penalty box. Considering red card incidents, no clear pattern can be observed. In the 17/18 season, the larger number of red cards given after VAR interventions might indicate that referees are more likely to make decisions that are detrimental to the home team, but the use of VAR seems to be instrumental in correcting this. However, this trend cannot be seen in the 18/19 season; therefore, it might rather be a result of the very first experience with the system, and we cannot derive concrete statements from the very few red card events. Furthermore, the fact that no differences existed between more and less experienced referees supports the overall purpose of the VAR system: proving and increasing overall fairness. We hypothesised that VAR would intervene more often when the referee is new and, thus, inexperienced in the league. However, the insignificant results suggest that the promotion and selection process for referees at the highest leagues works well.
This is in line with the study by Picazo-Tadeo et al. (2017), showing that different levels of referee experience are not, by themselves, able to explain home bias in terms of fouls awarded.

The rather small effects and lack of differences suggest that VAR implementation has purposes other than classic agent monitoring. Given the reputation associated with the referee’s on-field performance and the absence of systematic differences between experienced and inexperienced referees regarding the use of VAR, it can be assumed that VAR strengthens the general role of the referee and enables direct feedback for the spectators. VAR works in both directions, for the principal and the agent. Through the system, the principal retains strict control over the agent but also gains a means of protecting him/her. Further, VAR gives the principal the opportunity to clear him-/herself of suspicions of bribery, of deliberately appointing certain referees for certain matches and of general manipulation. The referee has the opportunity to prove him-/herself as well as to substantially lessen the chances that he/she will make potentially wrong decisions.

Regarding limitations, it should be noted that we deliberately focused on the decisive situations of penalty kicks and red cards. These are only a fraction of the fouls and related incidents that actually take place, and they occur at constant levels overall but with substantial differences in terms of VAR decisions. Also, the monitoring in elite football is already very robust. In recent years, various performance improvements have been implemented, such as the addition of a fourth official. Furthermore, we considered only two seasons directly after the official implementation of VAR; more experience with VAR might increase its general acceptance, and more mainstream use might provide a larger database to analyse. Additionally, we did not consider seasonal changes for referees regarding the different levels of sanctioning certain fouls. Moreover, we considered whole matches and single VAR interventions as the unit of observation, but Buraimo et al. (2010, 2012) and Del Corral et al. (2010) proposed that match minutes might be more suitable for studying specific situations during a match, as these situations are likely to depend on each other, such that a referee’s decision is likely to be based on the decisions he or she has made earlier. In addition, we did not separately examine “double punishments” in the form of a penalty kick and a red card. In our data, such situations were considered as two separate incidents. Additionally, and for reasons of simplification, we did not distinguish between a combined yellow–red card and a straight red card. Future research should consider these points and analyse the surveillance strategy of the VAR system from different angles; future studies might also include qualitative analyses of referees, managers, coaches, players, and fans. Also, future studies could extend the considered time frame, include other leagues and record further game-specific characteristics.
Lastly, it is important to emphasise that we investigated a sample of highly trained and incentivised agents operating in an already well-optimised field. While this might be comparable to other professions that retain a high level of autonomy and in which the success or failure of a decision rests with the decision-maker (e.g., judges or experts in firms), applying monitoring tools to a more heterogeneous sample in a different field should result in different responses from agents (Lazear 2000).

In conclusion, our results demonstrate that monitoring technology adds value for professional football referees and, by extension, for experts, but not necessarily by increasing the efficiency of their decisions; instead, it underlines the efficiency of their existing training and selection processes.