Abstract
Over the course of the twentieth century, American wages increased by a factor of about 100, while the wages of professional baseball players increased by a factor of 450, but that increase was neither smooth nor consistent. We use a unique and expansive dataset of salaries and performance variables of Major League Baseball pitchers that spans over 400 players and 60 years during the reserve clause era to identify factors that determine salaries and examine how the importance of various factors has changed over time. We employ a Markov regime-switching regression model borrowed from the macroeconomics literature, which allows regression coefficients to switch exogenously between two or more values as time progresses. This method lets us identify changes in wage determination that may have occurred because of a change in the league’s competitiveness, a change in the relative bargaining power between players and teams, or other factors that may be unknown or unobservable. We find that even though Major League Baseball was a tightly controlled monopsony with the reserve clause, there was a significant shift in salary determination that lasted from the Great Depression until after World War II, during which players’ salaries were more closely linked to their recent performance.
Notes
 1.
See Haupert (2009) for a discussion of MLB wages during different labor regimes.
 2.
Section 3 has a more detailed discussion of this literature.
 3.
A time-fixed-effects panel regression can also identify time variations in the average salary while accounting for other explanatory variables, but this method does not otherwise allow the structure of the regression relationship to change. We chose a regime-switching procedure instead to specifically capture changes in the relationship between the dependent variable (salary) and the explanatory variables (performance and experience) over time.
 4.
Seymour (1960), p. 85.
 5.
Failed competitors and their years of operation: Union Association 1884, American Association 1882–1891, Players League 1890, Federal League 1914–1915. The American League was formed as a competing league in 1901 and merged with the National League in 1903.
 6.
See Haupert (2009).
 7.
For a compelling history of these negotiations, see Miller (1991).
 8.
p. 915.
 9.
Scully (1974), p. 934 footnote.
 10.
p. 549.
 11.
p. 569.
 12.
With time series data, a Kalman filter can be used to estimate a model with time-varying regression coefficients, where the coefficients may evolve according to their own autoregressive process and/or depend on explanatory variables. See Hamilton (1994), Chapter 13 for a foundation for structuring and estimating such models.
 13.
Kim and Nelson (1999a) use the method to find regime switches in the volatility of shocks that drive the business cycle. Many authors after them have used it to detect changes in monetary policy and/or macroeconomic volatility.
 14.
For the standard pooled panel regression model, the assumption that the error term is normally distributed is not necessary. We make this assumption at the introduction of the model because it will be necessary in order to estimate the regime-switching panel model by maximum likelihood.
References
Bai J, Perron P (1998) Estimating and testing linear models with multiple structural changes. Econometrica 66:47–78
Burger JD, Walters SJK (2003) Market size, pay, and performance: a general model and application to major league baseball. J Sports Econ 4:108–125
Fort R (1992) Pay and performance: is the field of dreams barren? In: Sommers PM (ed) Diamonds are forever: the business of baseball. Brookings, Washington, DC, pp 134–162
Frank RH (1984) Are workers paid their marginal products? Am Econ Rev 74:549–571
Hamilton J (1989) A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica 57:357–384
Hamilton J (1994) Time series analysis. Princeton University Press, Princeton
Haupert MJ (2009) Player pay and productivity in the reserve clause and collusion eras. Nine J Baseball Hist Cult 18:63–85
Hoaglin DC, Velleman PF (1995) A critical look at some analyses of major league baseball salaries. Am Stat 49:277–285
Kahn LM (1993) Free agency, long-term contracts and compensation in major league baseball: estimates from panel data. Rev Econ Stat 75:157–164
Kim CJ (1994) Dynamic linear models with Markov-switching. J Econom 60:1–22
Kim CJ, Nelson CR (1999a) Has the US economy become more stable? A Bayesian approach based on a Markov-switching model of the business cycle. Rev Econ Stat 81:608–616
Kim CJ, Nelson CR (1999b) State-space models with regime switching: classical and Gibbs-sampling approaches with applications. MIT Press, Cambridge
Krautmann AC (1999) What’s wrong with Scully-estimates of a player’s marginal revenue product? Econ Inq 37:369–381
Krautmann AC, Gustafson E, Hadley L (2003) A note on the structural stability of salary equations: major league baseball pitchers. J Sports Econ 4:56–63
Krautmann AC, Oppenheimer M (2002) Contract length and the return to performance in major league baseball. J Sports Econ 3:6–17
MacDonald DN, Reynolds MO (1994) Are baseball players paid their marginal products? Manag Decis Econ 15:443–457
Miller M (1991) A whole different game: the sport and business of baseball. Carol Publishing Group, Secaucus
Scully GW (1974) Pay and performance in major league baseball. Am Econ Rev 64:915–930
Scully GW (1989) The business of major league baseball. University of Chicago Press, Chicago
Seymour H (1960) Baseball: the early years. Oxford University Press, New York
Zimbalist A (1992) Salaries and performance: beyond the Scully model. In: Sommers PM (ed) Diamonds are forever: the business of baseball. Brookings, Washington, DC, pp 109–133
Additional information
For very helpful comments, we would like to thank Kevin Quinn, David Surdam, and the participants at the following conferences: Cliometrics Society session of WEAI 2010 annual meeting, Economic and Business Historical Society 2011 annual conference, and the 7th BETA Workshop in Historical Economics. We would also like to thank Jake Kimmet, Tony Lyga, and Eric Streske for valuable research assistance. All errors are our own.
Appendices
Appendix 1: Filtering procedure
Hamilton (1989) describes an iterative procedure to evaluate the likelihood function of a Markov regime-switching model for a single time series. In this appendix, we describe how we extend his method to a pooled panel regression model. Consider the following pooled regression model with regime switching,
$$ y_{i,t} = x_{i,t}' \beta(s_t) + e_{i,t}, $$
where subscript i denotes a given individual, subscript t denotes a given time period, \(y_{i,t}\) is the dependent variable, and \(x_{i,t}\) is a vector of explanatory variables that may include both variables that vary across time for an individual and variables that remain constant over time. The regime state is given by \(s_t \in \{1,\ldots,S\}\), where S is the number of regimes. The vector of coefficients is given by \(\beta(s_t) = \beta_k\) if \(s_t = k\), and the error term is independently and identically normally distributed, \(e_{i,t} \sim N[0, \sigma(s_t)]\), where the standard deviation is given by \(\sigma(s_t) = \sigma_k\) if \(s_t = k\). The regime state, \(s_t\), evolves according to the Markov chain \(P(s_t=k \mid s_{t-1}=j, \Uppsi_{t-1}) = p_{jk}\), where \(p_{jk}\) denotes the probability that the economy switches from state j to state k as time enters period t. These transition probabilities are parameters to be estimated along with the other regression parameters, and \(\Uppsi_{t-1}\) denotes all information up through period t − 1.
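To make the data-generating process concrete, the following is a minimal Python sketch that simulates a panel from the model just described. It is an illustration only, not the authors' code: the function name `simulate_regime_panel`, the balanced panel (the same n individuals in every period), the uniformly drawn initial regime, the standard normal regressors, and all parameter values are assumptions. Regimes are indexed 0 to S − 1 rather than 1 to S.

```python
import numpy as np

def simulate_regime_panel(betas, sigmas, P, T, n, seed=0):
    """Simulate y[t, i] = x[t, i] . beta(s_t) + e[t, i], where a single
    regime path s_t is shared by all individuals in period t.

    betas:  (S, K) coefficient vector for each regime
    sigmas: (S,)   error standard deviation for each regime
    P:      (S, S) transition matrix with P[j, k] = p_jk
    """
    betas = np.asarray(betas, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    P = np.asarray(P, dtype=float)
    rng = np.random.default_rng(seed)
    S, K = betas.shape

    # Draw the regime path from the Markov chain.
    s = np.empty(T, dtype=int)
    s[0] = rng.integers(S)  # assumption: initial regime drawn uniformly
    for t in range(1, T):
        s[t] = rng.choice(S, p=P[s[t - 1]])

    # Hypothetical regressors: an intercept plus standard normal draws.
    x = rng.normal(size=(T, n, K))
    x[:, :, 0] = 1.0

    # Errors are i.i.d. normal with the standard deviation of the regime.
    e = rng.normal(size=(T, n)) * sigmas[s][:, None]
    y = np.einsum("tnk,tk->tn", x, betas[s]) + e
    return y, x, s

# Two regimes that differ in the slope coefficient and the error variance.
y, x, s = simulate_regime_panel(
    betas=[[1.0, 0.5], [1.0, 2.0]], sigmas=[0.3, 0.6],
    P=[[0.95, 0.05], [0.10, 0.90]], T=60, n=20)
```

Because the regime is common to all individuals in a period, every observation in a given period shares the same coefficient vector and error variance, which is what lets the cross-section sharpen inference about the regime.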
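Given that the error term \(e_{i,t}\) is normally distributed, if \(s_t = k\) were known, the probability density function for \(y_{i,t}\) would be given by
$$ f(y_{i,t} \mid s_t=k, \Uppsi_{t-1}) = \frac{1}{\sqrt{2\pi}\,\sigma_k} \exp\left( -\frac{(y_{i,t} - x_{i,t}'\beta_k)^2}{2\sigma_k^2} \right). \tag{5} $$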
Let \(f(y_t \mid \Uppsi_{t-1})\) denote the joint density function, unconditional on the regime state, for all observations of the dependent variable in time t, where \(y_t\) denotes the set of observations for every individual at period t, \(y_t \equiv \{y_{1,t}, y_{2,t}, \ldots, y_{n_t,t}\}\), and \(n_t\) is the number of individuals for which data are available at time t. Each iteration begins with the input \(P(s_{t-1}=j \mid \Uppsi_{t-1})\) for every \(j\in\{1,\ldots,S\}\) and has the output \(P(s_t=k \mid \Uppsi_t)\), and the process requires an initial condition for \(P(s_0 = j)\). The filtering procedure takes as given the parameters \(\beta_k\), \(\sigma_k\), and \(p_{jk}\) for all j, k. Maximum likelihood estimates for these parameters can be obtained by maximizing the joint density function for all the data (the output from the filtering procedure) with respect to these parameters. The filtering algorithm follows these steps:

Step 1: Find probabilities for being in each regime in time t, given information up through period t − 1. These probabilities are given by,
$$ P(s_t=k \mid \Uppsi_{t-1}) = \sum_{j=1}^{S} P(s_t=k \mid s_{t-1}=j) P(s_{t-1}=j \mid \Uppsi_{t-1}), $$
where \(P(s_t=k \mid s_{t-1}=j) \equiv p_{jk}\) is the Markov switching parameter, and \(P(s_{t-1}=j \mid \Uppsi_{t-1})\) is known from the previous iteration (or initial condition).

Step 2: Evaluate the conditional joint density function \(f(y_t \mid \Uppsi_{t-1})\), which is computed by evaluating the following successive densities:
$$ \begin{array}{l} f(y_t \mid s_t=k, \Uppsi_{t-1}) = \prod\limits_{i=1}^{n_t} f(y_{i,t} \mid s_t=k, \Uppsi_{t-1}), \\ f(y_t \mid \Uppsi_{t-1}) =\sum\limits_{k=1}^{S} f(y_t \mid s_t=k, \Uppsi_{t-1}) P(s_t=k \mid \Uppsi_{t-1}). \end{array} $$
The first equation is valid since \(e_{i,t}\) and \(e_{i',t}\) are independent for i ≠ i′, and the density \(f(y_{i,t} \mid s_t=k, \Uppsi_{t-1})\) is given in Eq. 5. In the second equation, \(P(s_t=k \mid \Uppsi_{t-1})\) is given from step 1.

Step 3: Evaluate the updated probability for being in each regime in time t, given information up through period t. These probabilities are given by,
$$ \begin{aligned} P(s_t=k \mid \Uppsi_{t}) &= P(s_t=k \mid y_t, \Uppsi_{t-1}) = \frac{f(y_t, s_t=k \mid \Uppsi_{t-1})}{f(y_t \mid \Uppsi_{t-1})} \\ & = \frac{f(y_t \mid s_t=k, \Uppsi_{t-1}) P(s_t=k \mid \Uppsi_{t-1})}{f(y_t \mid \Uppsi_{t-1})}, \end{aligned} $$
where the densities and probability needed to evaluate the second line are given in steps 1 and 2.

Step 4: Return to step 1 until t = T, where T is the number of periods in the sample. The joint distribution for all the data is given by,
$$ f(y^T \mid \Uppsi_{T-1}) = \prod_{t=1}^{T} f(y_t \mid \Uppsi_{t-1}), \tag{6} $$
where \(f(y_t \mid \Uppsi_{t-1})\) is given from step 2. Taking logs, this can be transformed to the log-likelihood function,
$$ l(y^T) = \sum_{t=1}^{T} \log \left( f(y_t \mid \Uppsi_{t-1}) \right). \tag{7} $$
Numerical maximization methods can be used to maximize Eq. 7 to obtain maximum likelihood estimates for \(\beta(s_t)\) and \(\sigma^{2}(s_t)\) and the transition probabilities \(p_{jk}\).
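The four filtering steps can be sketched in Python as follows. This is a hedged illustration rather than the authors' implementation: the function name `panel_hamilton_filter`, the argument layout (lists of per-period arrays, so that the number of individuals may differ across periods), and the zero-based regime index are assumptions. The sketch also evaluates the regime densities in logs and rescales by the largest one before exponentiating, a numerical safeguard not described in the text, because a product of many individual densities can underflow.

```python
import numpy as np

def panel_hamilton_filter(y, x, betas, sigmas, P, p0):
    """Filtered regime probabilities and log-likelihood (Eq. 7).

    y:      list of T arrays, y[t] has shape (n_t,)
    x:      list of T arrays, x[t] has shape (n_t, K)
    betas:  (S, K) coefficients per regime
    sigmas: (S,)   error standard deviations per regime
    P:      (S, S) transition matrix, P[j, k] = p_jk
    p0:     (S,)   initial condition P(s_0 = j)
    """
    betas = np.asarray(betas, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    P = np.asarray(P, dtype=float)
    T, S = len(y), len(sigmas)
    filtered = np.empty((T, S))
    loglik = 0.0
    prev = np.asarray(p0, dtype=float)
    for t in range(T):
        # Step 1: predicted probabilities P(s_t = k | info through t-1).
        pred = prev @ P
        # Step 2: log of the joint density of y_t under each regime,
        # summing the log of Eq. 5 over the n_t individuals.
        resid = np.asarray(y[t])[:, None] - np.asarray(x[t]) @ betas.T
        logf = (-0.5 * np.log(2.0 * np.pi) - np.log(sigmas)
                - 0.5 * (resid / sigmas) ** 2).sum(axis=0)
        m = logf.max()                    # rescale to avoid underflow
        joint = np.exp(logf - m) * pred   # proportional to f(y_t, s_t = k)
        denom = joint.sum()
        loglik += m + np.log(denom)       # adds log f(y_t | info through t-1)
        # Step 3: updated probabilities P(s_t = k | info through t).
        prev = joint / denom
        filtered[t] = prev
    return filtered, loglik

# Hypothetical two-period example with 3 and 4 individuals.
y_demo = [np.array([0.9, 1.1, 1.0]), np.array([2.8, 3.1, 3.0, 2.9])]
x_demo = [np.ones((3, 1)), np.ones((4, 1))]
filt, ll = panel_hamilton_filter(
    y_demo, x_demo, betas=[[1.0], [3.0]], sigmas=[0.2, 0.2],
    P=[[0.9, 0.1], [0.1, 0.9]], p0=[0.5, 0.5])
```

To obtain maximum likelihood estimates, one would pass the negative of the returned log-likelihood to a numerical optimizer, with the parameters transformed so that standard deviations stay positive and each row of the transition matrix sums to one.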
Appendix 2: Smoothing procedure
Once estimates for \(\beta(s_t)\), \(\sigma^{2}(s_t)\) and all the transition probabilities are obtained, one may use the results from the filtering method to obtain smoothed estimates for \(P(s_t=j \mid \Uppsi_T)\), the expected probability of being in each state for every period in the sample, using all the information from the sample. The smoothing procedure described here is unchanged from Hamilton (1989) and is described again here for convenience.
The smoothing procedure begins at the end of the sample period, and each iteration computes \(P(s_t=k \mid \Uppsi_T)\) as its output from period t = T − 1 to t = 1, taking the output of the previous iteration, \(P(s_{t+1}=l \mid \Uppsi_T)\), as an input. The starting value, \(P(s_T=k \mid \Uppsi_T)\), is given from the output of Step 3 in the filtering procedure above for time t = T.

Step 1: Compute the conditional probability \(P(s_t=k \mid s_{t+1}=l, \Uppsi_t)\) based on output from the filtering procedure:
$$ P(s_t=k \mid s_{t+1}=l, \Uppsi_t) = \frac{P(s_t=k, s_{t+1}=l \mid \Uppsi_t)}{P(s_{t+1}=l \mid \Uppsi_t)} = \frac{P(s_t=k \mid \Uppsi_t) P(s_{t+1}=l \mid s_t=k)}{P(s_{t+1}=l \mid \Uppsi_t)}. $$
In the last expression, \(P(s_{t+1}=l \mid \Uppsi_t)\) is known from Step 1 of the filtering procedure, \(P(s_{t}=k \mid \Uppsi_t)\) is known from Step 3 of the filtering procedure, and \(P(s_{t+1}=l \mid s_t=k)\) is the known Markov transition probability.

Step 2: Approximate the full information joint probability \(P(s_t=k, s_{t+1}=l \mid \Uppsi_T)\) according to,
$$ \begin{aligned} P(s_t=k, s_{t+1}=l \mid \Uppsi_T) &= P(s_{t+1}=l \mid \Uppsi_T) P(s_t=k \mid s_{t+1}=l, \Uppsi_T) \\ & \approx P(s_{t+1}=l \mid \Uppsi_T) P(s_t=k \mid s_{t+1}=l, \Uppsi_t). \end{aligned} $$
In the second expression, \(P(s_{t+1}=l \mid \Uppsi_T)\) is known from the previous iteration of the loop (or the initial condition) and \(P(s_t=k \mid s_{t+1}=l, \Uppsi_t)\) is the output from Step 1.

Step 3: The unconditional probability \(P(s_t=k \mid \Uppsi_T)\) is given by,
$$ P(s_t=k \mid \Uppsi_T) = \sum_{l=1}^{S} P(s_t=k, s_{t+1}=l \mid \Uppsi_T). $$
Step 4: Return to Step 1 until t = 1.
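The backward recursion in the smoothing steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `hamilton_smoother` is hypothetical, the filtered probabilities are taken as a given (T, S) array, regimes are indexed from zero, and every predicted probability is assumed to be strictly positive so that the division in Step 1 is defined. The numbers in the usage example are made up.

```python
import numpy as np

def hamilton_smoother(filtered, P):
    """Smoothed probabilities P(s_t = k | all T periods of data).

    filtered: (T, S) array, filtered[t, k] = P(s_t = k | info through t)
    P:        (S, S) transition matrix, P[k, l] = P(s_{t+1} = l | s_t = k)
    """
    filtered = np.asarray(filtered, dtype=float)
    P = np.asarray(P, dtype=float)
    T, S = filtered.shape
    smoothed = np.empty_like(filtered)
    smoothed[-1] = filtered[-1]  # starting value: filter output at t = T
    for t in range(T - 2, -1, -1):
        # Predicted probabilities P(s_{t+1} = l | info through t).
        pred_next = filtered[t] @ P
        # Step 1: cond[k, l] = P(s_t = k | s_{t+1} = l, info through t).
        cond = filtered[t][:, None] * P / pred_next[None, :]
        # Step 2: joint[k, l] approximates P(s_t = k, s_{t+1} = l | all data).
        joint = cond * smoothed[t + 1][None, :]
        # Step 3: sum over next-period regimes l.
        smoothed[t] = joint.sum(axis=1)
    return smoothed

# Hypothetical filtered probabilities for four periods and two regimes.
filtered_demo = np.array([[0.8, 0.2], [0.6, 0.4], [0.3, 0.7], [0.1, 0.9]])
P_demo = np.array([[0.9, 0.1], [0.1, 0.9]])
smoothed_demo = hamilton_smoother(filtered_demo, P_demo)
```

Each smoothed row sums to one because the conditional probabilities from Step 1 sum to one over k for every l. When regimes are persistent, the smoothed probability for a period is pulled toward the regime that later data favor.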
Cite this article
Haupert, M., Murray, J. Regime switching and wages in major league baseball under the reserve clause. Cliometrica 6, 143–162 (2012). https://doi.org/10.1007/s11698-011-0067-2
Keywords
 Major League Baseball
 Salary determination
 Markov regime switching
JEL classification
 C22
 C23
 J31