1 Introduction

Why do similar workers get paid differently? What role do non-wage attributes of different firms or jobs play? As Smith (1776) argued, wage differentials arise to compensate workers for undesirable non-monetary characteristics of jobs. According to labor market theory (Borjas 2010; Rosen 1986), compensating wage differentials (CWD) correspond to the higher income that must be offered to workers to motivate them to accept a job with undesirable characteristics. Similarly, these differentials are negative in the case of desirable jobs; that is, a worker would be willing to join a firm accepting a lower wage (relative to other similar jobs) in exchange for certain desirable non-monetary compensations.

The CWD theory has been tested in many fields. First, several studies estimated the wage differences across jobs with different probabilities of workplace injuries and found a positive relation between wages and unsafe work conditions (Biddle and Zarkin 1988; Duncan and Holmlund 1983; Viscusi and Moore 1987; Rao et al. 2003; Gertler et al. 2005). However, Dorsey and Walzer (1983) found that significant wage premiums are paid for an increased probability or severity of nonfatal injury for nonunion workers only, while the evidence on compensating differentials is mixed for unionized employees. More recently, Guardado and Ziebarth (2019) found that accident risks are positively correlated with wages but only if the risk originates from the firm or its technology. Conversely, if the risk is due to workers’ lower safety-related productivity, then it is negatively associated with their wages.

Second, a strand of research has tested the CWD theory related to employment benefits like health or unemployment insurance. According to the theory, workers must pay for health insurance provided by firms through lower wages. This prediction is confirmed by Jensen and Morrisey (2001), Olson (2002) and Miller (2004), whereas other studies have found a positive or a less clear correlation, which arises due to the fact that workers with and without health insurance differ considerably (for a review, see e.g., Currie and Madrian 1999). Moreover, the theory also predicts that the higher the risk of layoff in certain industries, the higher the wages (Moretti 2000; Averett et al. 2005). Carpenter et al. (2017) focused on firms with varying degrees of disamenity (income risks) and found that riskier firms must pay significantly higher wages to attract workers, whereas mobile workers sort into firms according to their attitudes toward risk, and the compensating differential shrinks.

We contribute to a third strand of literature by analyzing the role played by local amenities. In particular, we test the idea that workers are monetarily compensated for unpleasant or unattractive non-monetary job features. We exploit local differences in a quality-of-life indicator, the crime rate, population, and development.

Some empirical evidence confirms the CWD theory by highlighting a negative impact of desirable location characteristics on wages. Blomquist et al. (1988) and Roback (1982) indicated that compensation for location-specific and non-traded amenities takes place in the labor market, supporting the idea that wage differentials compensate for geographical differences in location amenities, such as higher crime rates. Indeed, while Braakmann (2009) found that wages are unrelated to changes in regional crime rates in Germany, further studies confirmed the positive wage compensation for the higher risk due to higher crime rates (see e.g., Smith Kelly 2011; Iriarte 2017). On the contrary, other articles did not provide any evidence of the theory’s predictions, yielding non-significant estimates or coefficients with the wrong sign (see e.g., Brown 1980; Bonhomme and Jolivet 2009).

All in all, the overview of previous studies on CWD reveals ambiguous results that could be due to biased estimates stemming from several statistical issues. These include unobserved worker heterogeneity related to labor market attachment, ability, motivation, and skills; unobserved firm heterogeneity; and measurement error in reporting workers’ productivity or firms’ non-monetary characteristics. These sources of unobserved heterogeneity make it difficult to credibly identify the compensating wage differentials and may bias the results. Indeed, reliable measures of workers’ productivity are necessary for confirming the theory’s predictions but are rarely observed (Hwang et al. 1992). Moreover, high-ability workers are likely to earn higher wages and, at the same time, may also have better working conditions (ability bias). Therefore, empirical analysis must track workers’ earnings over time as they change jobs and also needs to control for individual fixed effects (Brown 1980; Duncan and Holmlund 1983; Garen 1988).

To address these issues, in our analysis we exploit sports data from the universe of football players in the Italian Serie A. While this labor market has received less empirical attention so far, it offers several features to overcome the methodological issues highlighted above. Many indicators about both worker and firm characteristics and productivity are widely available and observed with high frequency. Moreover, the possibility of employer-employee matching and high turnover rates over time allow us to control for unobserved worker heterogeneity while exploiting variations in firms and location characteristics. We reconstructed the career histories of all players in the Italian top league for ten years, spanning from the 2010–11 season to the 2019–20 season. Our focus is on the signing of new contracts resulting from wage renegotiations within the same club or due to transfers to another team.

We also take advantage of observed measures of their productivity and skills at the individual level, as well as make use of measures of team quality to control for firms’ heterogeneity and retrieve information on local amenities at the city and province level. Thanks to the availability of these data, we are able to test the theory with measures of workers’ skills and characteristics, team quality, and local amenities at hand. As pointed out by Kahn (2000), sports data are much more detailed than usual microdata and allow us to obtain a complete dataset of employee-employer matches, career history, and the productivity of each worker.

As a matter of fact, there is a growing body of economic research using sports data to explain fundamental economic mechanisms (Bar-Eli et al. 2020). In testing CWD evidence, Michaelides (2010) used data on professional basketball players from the NBA. His findings strongly support the theory’s predictions, highlighting that location amenities and team characteristics have a substantial effect on players’ wages, whereas not accounting for unobserved player heterogeneity distorts the quality of the results. Focusing on the National Football League, Dole and Kassis (2010) highlighted that playing on artificial turf increases the risk of injury and, therefore, players receive wage premiums. Using data on baseball players in MLB, Link and Yosifov (2012) focused instead on the effect of contract length on wages (without bonus). They suggest that free agent position players appear willing to trade monetary returns for performance for the security of a longer guaranteed contract, and this matters, in particular, for older players. According to Zimmerfaust (2018), the expected productivity of the team a worker will join produces the main significant and negative compensating wage differential. This effect is particularly driven by younger workers who trade 25% of their wages to join a team with higher expected productivity and, therefore, more human capital accumulation.

A further contribution of this paper is investigating the role of compensating wage differentials on workers in the top tail of the earnings distribution with respect to local amenities. This relationship is relevant both for unveiling the dynamics underlying wage determination and for its geographical spillovers in terms of local development and entertainment, for example. Italy represents an ideal setting for at least two reasons. First, Serie A is one of the top five most followed football leagues in Europe, and football players represent an important share of the top earners, as they constitute approximately one-fifth of the top 0.01% of the earners’ distribution in Italy (Franzini et al. 2016). Second, Italy exhibits a large geographical heterogeneity in regional development, local amenities, and quality of life. Apart from the well-known North-South divide, differences with respect to several economic and social issues emerge more or less everywhere in the country (Felice 2018).

We find that compensating wage differentials do emerge in superstars’ labor markets. Our findings highlight the role of location amenities in shaping such differentials. Football players require compensation for undesirable local characteristics such as crime rate and population density. An additional contribution of our paper is investigating how wage differentials relate to the riskiness of the job in terms of low performance. We do find evidence of this. Newly promoted teams must pay a wage premium to players; we argue that this is due to the higher risk of relegation faced by these teams. Additionally, we also provide evidence “beyond the mean”. By applying the Unconditional Quantile Regression approach proposed by Firpo et al. (2009), we show that these estimates vary in intensity across the wage distribution. Workers in the right tail exert higher market power and receive higher compensations. In contrast, there is scant evidence of such compensating differentials emerging on the left side of the wage distribution.

The remainder of the paper is organized as follows. In Sect. 2 we present the data used in our analysis and provide some descriptive statistics. Section 3 explains our empirical strategy. Section 4 shows the results. Section 5 summarizes and concludes.

2 Data and descriptive analysis

We assembled an original dataset recording longitudinal information on professional football players in Italian Serie A, covering a 10-season period from 2010–2011 to 2019–2020. First, information on players’ yearly wages is extracted from the annual report published by the most influential Italian sport newspaper, La Gazzetta dello Sport, at the beginning of each football season. In order to provide additional evidence about the accuracy of this information, in Fig. 2, we plot the total salary bills reported by football clubs in the official annual balance sheets against the aggregated individual salaries provided by La Gazzetta dello Sport for all the clubs in our sample. The figure displays a very strong correlation between the two, with most points very close to the best-fit red line. Dispersion around the line might depend on variations in taxation, such as the so-called growth decree that allowed Italian clubs to attract talents from abroad (wage bills in the annual sheet are gross figures), mid-season transfers, performance-related bonuses, or penalties due to disciplinary sanctions, while Gazzetta’s data are recorded net of taxes and without taking into account performance-related bonuses. Then, wage data are matched with information on players’ characteristics and performance in the previous season collected from whoscored.com and transfermarkt.com.

We have an unbalanced panel dataset; the relegation/promotion system between Serie A and Serie B and the transfer market across national and international teams generate a turnover of players in the league under analysis. This turnover involves heterogeneous types of players, both less talented players traded with teams playing in minor leagues and more talented players traded with top European clubs, so it mitigates concerns about selective attrition (Carrieri et al. 2020; Principe and van Ours 2022). Furthermore, because performances are measured differently for goalkeepers compared to the outfield players, we excluded the former from the analysis, following the standard approach in this literature (Lucifora and Simmons 2003; Carrieri et al. 2018; Principe and van Ours 2022). The final sample included 2788 player-season observations.

Table 1 reports descriptive statistics. It shows the average log wages and the means of some players’ characteristics for (i) the overall sample; (ii) age groups; (iii) position on the pitch (i.e., defender, midfielder, and forward); and (iv) origin (i.e., Italian, European, and extra-European). As the main productivity measure, we use the yearly-level algorithm-based rating provided by whoscored.com. We note that younger players are employed (from the first minute) on average half as much as the older ones in Serie A. Moreover, a noteworthy point in panel (d) is that Italian players are on average older and paid less than foreign players. Further descriptive statistics are provided in Table 4 in the “Appendix”.

Table 1 Summary statistics

Figure 1 displays kernel densities of wages for the full sample as well as for the same sub-groups of players as in Table 1, while the same distributions, in logarithmic scale, are displayed in Fig. 3. All the distributions are positively skewed with a long upper tail. This supports the idea that a restricted number of players earns huge wages compared to the remainder of players in the distribution, and it is consistent with the economic theories of ‘superstars’ (Rosen 1981; Adler 1985). Moreover, looking at the sub-categories, the extent to which younger players earn less than older ones emerges, while forwards earn a higher wage premium compared to players in other positions.

Fig. 1

Kernel densities of wages. Notes: Kernel densities based on 2788 player-season observations. Wages are reported in millions of euros

For the purpose of our analysis, we also collected information concerning both team characteristics and local amenities from several sources of data. First, regarding team characteristics, we retrieved the ELO rating from elofootball.com, the lagged ranking from each Serie A final rank, and a dummy equal to 1 if a team had been promoted from Serie B at the end of the previous season. The ELO rating is an estimate of a club’s strength based on its past results and history, using a weighting for the kind of match, an adjustment for the home team advantage, and an adjustment for goal difference in the match result. The yearly-level rating considers all official international matches for which results are available and tends to converge on a team’s true strength relative to its competitors after about 30 matches. Second, we collected information about local amenities at the municipality/province level from the Italian National Institute of Statistics (ISTAT). In particular, we obtained annual data on the total crime rate per 1000 inhabitants and the city population. Additionally, an index of general quality of life is retrieved from the “Indagine sulla qualità della vita” conducted by Sole24Ore. This index draws on a diverse range of factors, including economic stability, recreational services, education, quality of infrastructure and public transport, and environmental quality, and therefore captures the overall living conditions of individuals and communities. Its broad coverage makes it a valuable tool for gauging the multifaceted aspects that contribute to a high quality of life. Last, we also added a dummy for clubs located in Southern-Islands regions. Figure 4 shows the map of the locations included in our sample, while Table 4 reports definitions and summary statistics for all the variables. Finally, Fig. 5 plots some preliminary correlations between wages and location amenities.
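To make the three ingredients of the ELO rating mentioned above concrete (match weighting, home advantage, goal-difference adjustment), the following is a minimal sketch of a generic Elo-style update. It is not the formula used by elofootball.com: the function names, the default match weight of 20, the 100-point home advantage, and the goal-difference multipliers are all illustrative assumptions.

```python
def expected_score(rating_home, rating_away, home_advantage=100.0):
    """Expected score of the home team on a 0..1 scale (logistic Elo curve).

    home_advantage is an assumed bonus added to the home rating.
    """
    diff = rating_home + home_advantage - rating_away
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))


def goal_multiplier(goal_diff):
    """Assumed scaling so that larger winning margins move ratings more."""
    margin = abs(goal_diff)
    if margin <= 1:
        return 1.0
    if margin == 2:
        return 1.5
    return (11.0 + margin) / 8.0


def update_elo(rating_home, rating_away, goals_home, goals_away,
               match_weight=20.0, home_advantage=100.0):
    """Return updated (home, away) ratings after one match.

    match_weight stands in for the 'kind of match' weighting (assumed value).
    """
    if goals_home > goals_away:
        result = 1.0
    elif goals_home == goals_away:
        result = 0.5
    else:
        result = 0.0
    surprise = result - expected_score(rating_home, rating_away, home_advantage)
    delta = match_weight * goal_multiplier(goals_home - goals_away) * surprise
    # Zero-sum update: what the home side gains, the away side loses.
    return rating_home + delta, rating_away - delta


# Two equally rated clubs; the home side wins 1-0.
new_home, new_away = update_elo(1500.0, 1500.0, 1, 0)
```

Because the update is zero-sum and proportional to the gap between actual and expected results, repeated application pulls a rating toward the level at which results stop being surprising, which is the convergence property described in the text.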

3 Estimation strategy

We estimate the worker’s hedonic wage equation by including a large set of measures of worker’s productivity and job characteristics at team and city/county levels. The equation is:

$$\begin{aligned} \hbox {lnWage}_{it} = \beta P_{it-1} + \gamma X_{it-1} + \delta F_{it-1} + \alpha A_{it-1} + \theta _i + \phi _t + \varepsilon _{it} \end{aligned}$$
(1)

where \(\hbox {lnWage}_{it}\) is the log wage of player i in season t. P is a set of performance-related variables that allows us to control for time-varying heterogeneity through measures of productivity and the total number of matches played from the first minute in \(t-1\); X is a set of player characteristics such as age, age squared, and appearances in the national team, both during the previous season and in total. A second set of regressors F concerns firm attributes related to team history and success (ELO rating), lagged ranking, and promotion. These measures of team quality enable us to control for firms’ heterogeneity. Furthermore, A consists of a set of variables concerning local amenities. \(\beta \), \(\gamma \), \(\delta \) and \(\alpha \) are vectors of estimated parameters and \(\varepsilon _{it}\) is the error term. We use individual fixed effects \(\theta _i\) to take into account both unobserved differences across workers (e.g., talent, attitudes, motivation) and time-invariant characteristics such as nationality or racial status, and season fixed effects \(\phi _t\). The use of individual fixed effects and firm characteristics allows us to avoid measurement error in workers’ productivity due to spillovers from teammates’ effort and complementarities among teammates’ skillsets (Hamilton et al. 2003; Lazear 1999; Falk and Ichino 2006; Mas and Moretti 2009). It is worth noting that players’ wages are typically set at the time of contract signing with a new team or modified in case of renegotiation within the same club. However, under multi-year contracts, the wage in a given year is not directly influenced by the player’s performance in the previous season but is predetermined in advance and independent of other factors. Because of this, we restrict our estimation sample by observing the player’s wage only at the start of his contract or in case of renegotiation, and regress it on the covariates from the previous season. Our final sample consists of 2023 observations from 607 players corresponding to new contract signings resulting from changes in players’ wages within the same club or due to a transfer to another team.
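As an illustration of how an equation like Eq. 1 can be estimated, the sketch below applies the within (player-demeaning) transformation with season dummies to simulated data. Everything here is an assumption made for illustration: the helper names `within_demean` and `fe_ols`, the single regressor, and the simulated panel are hypothetical and do not reproduce the paper's data or software.

```python
import numpy as np


def within_demean(values, groups):
    """Subtract each group's (player's) mean from every column of values."""
    values = np.asarray(values, dtype=float)
    out = values.copy()
    for g in np.unique(groups):
        mask = groups == g
        out[mask] -= values[mask].mean(axis=0)
    return out


def fe_ols(y, X, player_ids, season_ids):
    """Player fixed effects via demeaning; season effects via dummies."""
    seasons = np.unique(season_ids)
    # One dummy per season, dropping the first season as the baseline.
    season_dummies = (season_ids[:, None] == seasons[None, 1:]).astype(float)
    Z = np.column_stack([X, season_dummies])
    Z_within = within_demean(Z, player_ids)
    y_within = within_demean(y[:, None], player_ids).ravel()
    coef, *_ = np.linalg.lstsq(Z_within, y_within, rcond=None)
    return coef[: X.shape[1]]  # coefficients on X only


# Simulated panel: 200 players observed over 5 seasons (illustrative only).
rng = np.random.default_rng(0)
n_players, n_seasons = 200, 5
player_ids = np.repeat(np.arange(n_players), n_seasons)
season_ids = np.tile(np.arange(n_seasons), n_players)
theta = rng.normal(size=n_players)   # unobserved player effect
phi = rng.normal(size=n_seasons)     # season effect
# The regressor is correlated with the player effect, so pooled OLS is biased.
x = theta[player_ids] + rng.normal(size=player_ids.size)
true_beta = 0.5
y = (true_beta * x + theta[player_ids] + phi[season_ids]
     + 0.1 * rng.normal(size=player_ids.size))

beta_hat = fe_ols(y, x[:, None], player_ids, season_ids)
```

Demeaning by player removes anything constant within a player, which is why time-invariant traits such as nationality cannot be separately estimated and are absorbed by the fixed effect.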

Moreover, we are interested in investigating whether the effect of the selected variables changes along the wage distribution. In fact, one may speculate that compensating mechanisms for disamenities are likely to emerge in particular for those workers with significant bargaining power. Our approach is to estimate Eq. 1 using the unconditional quantile regression (UQR) proposed by Firpo et al. (2009). The key advantage of the UQR approach over other distributional methods, such as the conditional quantile regression (CQR) proposed by Koenker and Bassett Jr (1978), is that it allows us to analyze the relationship between covariates and the unconditional distribution of wages. This possibility occurs because the UQR method provides a linear approximation of the unconditional quantiles of the dependent variable. The law of iterated expectations can be applied to the quantile being approximated and used to estimate the marginal effect of a covariate through a simple regression of a function of the outcome variable, the Recentered Influence Function (RIF), on the covariates. The transformed outcome variable (the RIF) is defined pre-regression; thus, unlike in the case of CQR, including any control variables does not change the definition of the quantile (Porter 2015).

In our setting, the RIF of wages is estimated directly from the data by first computing the sample quantile q and then estimating the density of the distribution of wage at that quantile using kernel density methods. Then, for a given observed quantile, a RIF is generated and can take one of two values depending on whether the observation’s value of the outcome variable is less than or equal to the observed quantile. The RIF is defined as:

$$\begin{aligned} \hbox {RIF} (W;q_\tau , F_W)=q_\tau +\frac{(\tau -\mathbbm {1}[W\le q_\tau ])}{f_W (q_\tau )} \end{aligned}$$
(2)

where \(q_{\tau }\) is the value of the outcome variable W, measuring the log wage, at the quantile \(\tau \). \(F_W\) is the cumulative distribution function of W, while \(f_W(q_{\tau })\) is the estimated density of W at \(q_{\tau }\). The indicator function, \(\mathbbm {1}[W\le q_{\tau }]\), identifies whether the value of the outcome variable W for the individual is at or below \(q_{\tau }\).
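The construction in Eq. 2 can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: it uses a Gaussian kernel with a normal-reference rule-of-thumb bandwidth (the text does not specify the kernel or bandwidth), the function names are hypothetical, and the data are simulated.

```python
import numpy as np


def gaussian_kde_at(point, sample, bandwidth=None):
    """Gaussian kernel density estimate of `sample`, evaluated at `point`."""
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumed default).
        bandwidth = 1.06 * sample.std(ddof=1) * n ** (-1.0 / 5.0)
    u = (point - sample) / bandwidth
    return np.exp(-0.5 * u ** 2).sum() / (n * bandwidth * np.sqrt(2.0 * np.pi))


def rif_quantile(w, tau):
    """RIF(W; q_tau, F_W) = q_tau + (tau - 1[W <= q_tau]) / f_W(q_tau)."""
    w = np.asarray(w, dtype=float)
    q_tau = np.quantile(w, tau)            # sample quantile
    density = gaussian_kde_at(q_tau, w)    # estimated density at q_tau
    below = (w <= q_tau).astype(float)     # indicator 1[W <= q_tau]
    return q_tau + (tau - below) / density


# Simulated log wages (illustrative only).
rng = np.random.default_rng(1)
log_wage = rng.normal(loc=0.0, scale=1.0, size=1000)
rif_90 = rif_quantile(log_wage, 0.9)
```

The resulting variable takes only two values, one for observations at or below the quantile and one for observations above it, and its sample mean is approximately the quantile itself.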

The RIF is then used as a dependent variable in a fixed effects regression on the covariates defined in Eq. 1. In practice, this corresponds to estimating a rescaled linear probability model. Thus, the linearization allows the estimation of the marginal effect of a change in distribution of the covariates on the unconditional quantile of wages.
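The “rescaled linear probability model” interpretation follows directly from Eq. 2. Writing \(\mathbbm {1}[W\le q_\tau ]=1-\mathbbm {1}[W>q_\tau ]\), and using Z as shorthand (introduced here only for illustration) for the full set of covariates in Eq. 1:

$$\begin{aligned} \hbox {RIF} (W;q_\tau , F_W)&=\underbrace{q_\tau +\frac{\tau -1}{f_W (q_\tau )}}_{\text {constant}}+\frac{\mathbbm {1}[W> q_\tau ]}{f_W (q_\tau )}, \\ E\left[ \hbox {RIF} (W;q_\tau , F_W)\mid Z\right]&=q_\tau +\frac{\tau -1}{f_W (q_\tau )}+\frac{\Pr (W>q_\tau \mid Z)}{f_W (q_\tau )}. \end{aligned}$$

A linear regression of the RIF on Z is therefore a linear probability model for the event \(W>q_\tau \), with every slope coefficient divided by the estimated density \(f_W(q_\tau )\).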

4 Results

4.1 Main findings

Table 2 reports fixed effect estimates of Eq. 1, using different specifications for the set of covariates included in the model. According to CWD theory, desirable characteristics will have a negative coefficient while undesirable job attributes will positively affect wages.

With respect to local amenities, we find that compensating wage differentials do exist in superstars’ labor market. In particular, crime rate and population are positively associated with wages. The point estimates for crime rates are consistently positive and statistically significant across all specifications, with values ranging from 0.008 to 0.009; this implies that a one-unit increase in the crime rate is associated with an approximate 0.8% to 0.9% increase in wages. This finding suggests a compensatory mechanism where players are remunerated with higher wages for being in areas with higher crime rates, potentially reflecting a risk premium. The point estimates for population of about 0.11 indicate that a 1% increase in population is associated with a 0.11% increase in wages. Interestingly, our findings regarding the population are in line with recent studies arguing that, controlling for unobserved worker heterogeneity by means of fixed effects techniques, workers earn significantly higher wages in more populated areas than in less populated areas (Hirsch et al. 2022; D’Costa and Overman 2014). The standard explanation for these findings is that agglomeration economies raise the level and growth in worker productivity in thick markets (Combes and Gobillon 2015). While this explanation seems to be incomplete for the particular labor market we analyze, our results suggest that, in the case of superstars, positive compensating differentials for bigger cities arise in order to compensate the players for the higher disutility of living in populated areas (i.e., traffic, poor air quality, lack of privacy) as well as the higher cost of living. Moreover, players earn a wage premium of about 9% to play for a team located in the South, although the effect is statistically significant only at the 10% level. This may reflect the well-known North–South divide in Italy, as regions in the South are historically less developed and offer fewer services, which may make them less desirable to move to.
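The percentage interpretations above can be checked with a short calculation. This assumes, as the reported magnitudes suggest, that the crime rate enters Eq. 1 in levels and population in logs. For a regressor in levels with coefficient \(\alpha \), a change \(\Delta A\) implies

$$\begin{aligned} \%\Delta \hbox {Wage}&=100\left( e^{\alpha \Delta A}-1\right) , \\ \alpha =0.008,\ \Delta A=1:\quad \%\Delta \hbox {Wage}&=100\left( e^{0.008}-1\right) \approx 0.80, \\ \alpha =0.009,\ \Delta A=1:\quad \%\Delta \hbox {Wage}&=100\left( e^{0.009}-1\right) \approx 0.90, \end{aligned}$$

while for a regressor in logs the coefficient is an elasticity, so a 1% increase in population implies

$$\begin{aligned} \%\Delta \hbox {Wage}=100\left( 1.01^{0.11}-1\right) \approx 0.11. \end{aligned}$$

For coefficients this small, the exact figures and the usual approximation \(100\,\alpha \) coincide to two decimal places.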

Concerning team-level variables, it is important to note that newly promoted teams need to pay a higher wage to attract players; the premium amounts to about 8–12% across all specifications. We argue that this might be related to the riskiness of the job. In fact, newly promoted teams face a higher risk of relegation, and this might translate into positive wage differentials in order to convince players to join them. The specification in column (3) of Table 2 shows that the interaction between Southern regions and newly promoted clubs seems to exacerbate this effect. Moreover, in some cases, the pay structure of the player’s wage may vary over time according to a pre-agreed schedule of payment. To account for this scenario, in column (4), we further reduce the sample to analyze only those football players who moved to another team, observing them at the start of the new contract. The results confirm the presence of positive compensating wage differentials for crime rates and more populated cities.

Finally, performance measures and individual characteristics are positively associated with wages, as expected. In particular, an increase in the overall performance, proxied by the algorithm-based rating, is associated with a wage premium of about 18–24%. In line with previous studies, the age profile of the player shows a nonlinear association with wage; the coefficients for age and age squared show a clear quadratic relationship, indicating that while age positively affects wages up to around 28.6 years, beyond this point the effect becomes negative. This reflects the typical career trajectory in sports, where players experience wage growth early in their careers, peak in their late 20s, and then face a gradual decline as they age further. Finally, our estimates reveal that experienced players (i.e., those with more appearances in the national team), ceteris paribus, enjoy a higher wage compared to their counterparts.
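The turning point of the age profile follows from the quadratic specification. Denoting by \(\gamma _1\) and \(\gamma _2\) the coefficients on age and age squared (labels introduced here for illustration; the estimates themselves are reported in Table 2), the marginal effect of age and the implied peak are

$$\begin{aligned} \frac{\partial \hbox {lnWage}}{\partial \hbox {Age}}=\gamma _1+2\gamma _2\,\hbox {Age}=0\quad \Longrightarrow \quad \hbox {Age}^{*}=-\frac{\gamma _1}{2\gamma _2}, \end{aligned}$$

which is a maximum when \(\gamma _1>0\) and \(\gamma _2<0\). The peak of around 28.6 years reported above corresponds to this ratio evaluated at the estimated coefficients.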

Table 2 Full set of parameter estimates (OLS-FE)

Table 3 reports parameter estimates from the UQR at the selected percentiles. Results show that at the bottom of the distribution, the location amenities have no statistically significant effect on wages. This suggests the absence of any compensating mechanism. In this segment of the distribution, players exert less market power and thus have fewer opportunities to be compensated for undesirable location characteristics. In contrast, the role of these amenities is amplified in the right-hand side of the distribution, where actual superstars lie. Here positive compensating wage differentials arise with respect to crime rate and population. Importantly, workers in the top quantiles also require compensation for the riskiness of employment; a newly promoted team pays a wage premium of about 17–24% to sign them.

Table 3 Parameter estimates (UQR)

4.2 Heterogeneity analysis

To check for heterogeneous effects in our findings, we performed a series of sub-group analyses. First, to establish whether the effects of local amenities in Italian Serie A differ between native and foreign players, we proceeded as follows: (i) we performed two separate analyses for the Italian and non-Italian samples; (ii) we further distinguished between European (including Italians) and non-European players. Results are presented in Table 5, columns (1) to (4), and show that the benchmark findings on local amenities are confirmed for both Italian and non-Italian players.

Second, it should be noted that different cohorts of workers may be attracted by different job attributes. For instance, younger players may be attracted by teams that guarantee more minutes played and visibility. At the same time, middle-aged players may be more interested in joining teams that play international cups while they are at the top of their careers. In contrast, older players may accept lower wages in exchange for longer contracts since they have fewer career opportunities in the immediate future (Link and Yosifov 2012). To account for these features, we ran separate regressions by age group. We report these findings in columns (5) to (7) of Table 5, showing that, while the benchmark results are mostly confirmed for middle-aged players, local amenities are not statistically significant for under-23 players and older players, with the exception of local population. Indeed, for these particular sub-samples the most important determinants of wages are players’ performances and their appearances from the first minute for the former, and team ranking for the latter.

Finally, we checked whether “distance from home” is a possible determinant of wage differentials. From Wikipedia, we collected information about the place of birth of all the Italian players included in our sample. Then, we measured the geographical distance in kilometers between the player’s city of birth and the club’s city, and we created a dummy taking value 1 if the player was born in the same city as the club. We report parameter estimates graphically in Fig. 6. Although significant only at the 10% level, these results are suggestive of a “distance from home” effect. Players require compensation for longer distances, while they are willing to accept lower wages to work in their own city.
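The two “distance from home” variables can be built as in the sketch below. It assumes great-circle (haversine) distance between city coordinates; the text does not state which distance measure was used, and the function names and the approximate coordinates for Rome and Milan are included purely for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2.0) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2.0) ** 2)
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def same_city_dummy(birth_city, club_city):
    """1 if the player was born in the club's city, 0 otherwise."""
    return int(birth_city.strip().lower() == club_city.strip().lower())


# Illustrative example: a player born in Rome who signs for a club in Milan.
rome = (41.9028, 12.4964)    # approximate coordinates
milan = (45.4642, 9.1900)    # approximate coordinates
distance_km = haversine_km(*rome, *milan)
born_in_club_city = same_city_dummy("Roma", "Milano")
```

In practice, matching on city names requires consistent spelling across sources, so a municipality code would be a more robust key than the raw string comparison used here.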

5 Conclusions

In this paper, we provide evidence on the role of local amenities in shaping compensating wage differentials in labor markets populated by superstars. We assembled an original dataset by reconstructing the career histories of all the players in the Italian Serie A over a period of 10 years, utilizing observed measures of their wages, productivity, and skills. Additionally, we employed measures of team quality to control for firm heterogeneity and retrieved information on local amenities at the city and province levels.

Table 4 Variables definition, sources, and summary statistics

Our findings support the existence of compensating wage differentials in football players’ labor market. On average, workers receive a positive wage compensation to work in places with undesirable characteristics such as high crime rates and the higher disutility of living in more populated areas. An additional contribution of our paper is estimating compensating wage differentials concerning the riskiness of employment. We show that workers are compensated to work in firms with a higher probability of (sporting) failure. In a ‘beyond the mean’ analysis, we also demonstrate that these relationships vary in intensity across the wage distribution. Workers in the right tail exert higher market power and receive higher compensation. In contrast, there is scant evidence of such compensating differentials emerging on the left side of the wage distribution. These results may have broader implications beyond the specific labor market we studied. Although they may appear to have low external validity for common employees, for whom further amenities like costs of living, house prices, and public services may be more relevant, the main findings about football players may extend to other workers with considerable bargaining power, such as CEOs and top management.

Fig. 2

Correlation between team wage bill and aggregated player wages. Notes: The figure illustrates the correlation between the team wage bill published in the official annual balance sheet and the aggregated team player wages provided by La Gazzetta dello Sport. The red line is the best-fit line

Fig. 3

Kernel densities of (log) wages. Notes: Kernel densities based on 2788 player-season observations

Fig. 4

Locations. Notes: The map shows the locations included in our sample. Teams from Southern regions and Islands are Bari, Benevento, Cagliari, Catania, Crotone, Lecce, Napoli, Palermo, and Pescara

Fig. 5

Log wages and local amenities. Notes: The figure reports correlations between log wage and log crime rate (left panel), and log wage and log population density (right panel). The size of the circles indicates city population

These findings also hold implications for both local development in terms of the quality of entertainment and leagues’ competitive balance. In fact, if clubs located in more developed areas pay less to attract superstars, this enhances the quality of entertainment and widens the gap across different locations. Consumers in high-quality-of-life areas also benefit from the performance of better players compared to those living in less developed areas. On the other hand, league competitive balance is also impacted. Newly-promoted teams, especially those in areas with lower amenities, must offer higher wages to attract players. Thus, assuming clubs face similar budget constraints, this process leads to an unequal distribution of talent reflected in lower uncertainty of outcomes and, consequently, diminished levels of consumer welfare.