Abstract

Lester Taylor (Taylor and Houthakker 2010) instilled a deep respect for estimating the parameters of statistical models by minimizing the sum of absolute errors (the L1 criterion), an important alternative to minimizing the sum of squared errors (the ordinary least squares, or OLS, criterion).


Notes

  1. One empirical indication of the usefulness of piecewise linear models is that a Google search on “piecewise linear regression” turned up hundreds of thousands of hits. Similarly large numbers of hits occur for synonyms such as “broken stick regression”, “two-phase regression”, “broken line regression”, “segmented regression”, “switching regression”, “linear spline”, and the Canadian and Russian preference, “hockey stick regression”.

  2. The author has not applied any of these to piecewise linear estimation.

References

  • Bassett G, Koenker R (1978) Asymptotic theory of least absolute error regression. J Am Stat Assoc 73:618–622
  • Breiman L (1993) Hinging hyperplanes for regression, classification, and function approximation. IEEE Trans Inf Theory 39(3):999–1013
  • Charnes A (1952) Optimality and degeneracy in linear programming. Econometrica 20(2):160–170
  • Charnes A, Cooper WW, Ferguson RO (1955) Optimal estimation of executive compensation by linear programming. Manage Sci 1:138–151
  • Cogger K (2010) Nonlinear multiple regression methods: a survey and extensions. Intell Syst Account Finance Manage 17:19–39
  • Galton F (1886) Regression towards mediocrity in hereditary stature. J Anthropol Inst Great Br Irel 15:246–263
  • Geoffrion A (1972) Perspectives on optimization. Addison-Wesley, Reading
  • Glover F (1986) Future paths for integer programming and links to artificial intelligence. Comput Oper Res 13:533–549
  • Hillier F, Lieberman G (1970) Introduction to operations research. Holden-Day, San Francisco
  • Hinkley D (1969) Inference about the intersection in two-phase regression. Biometrika 56:495–504
  • Hinkley D (1971) Inference in two-phase regression. J Am Stat Assoc 66:736–743
  • Hudson D (1966) Fitting segmented curves whose join points have to be estimated. J Am Stat Assoc 61:1097–1129
  • Koenker R, Bassett G (1978) Regression quantiles. Econometrica 46(1):33–50
  • Nelder J, Mead R (1965) A simplex method for function minimization. Comput J 7:308–313
  • Taylor L, Houthakker H (2010) Consumer demand in the United States: prices, income, and consumer behavior. Kluwer Academic Publishers, Netherlands
  • Wachsmuth A, Wilkinson L, Dallal G (2003) Galton’s Bend: a previously undiscovered nonlinearity in Galton’s family stature regression data. Am Stat 57(3):190–192

Author information

Correspondence to Kenneth O. Cogger.

Appendices

Appendix 1. Standard L1 or QR Multiple Regression

The estimation of a single multiple regression with L1 or QR is the following LP problem:

$$ {\text{min}}:\sum\limits_{i = 1}^{n} {\left( {\theta e_{i + } + (1 - \theta )e_{i - } } \right)}. $$

Such that:

$$ y_{i} - \beta^{\prime } x_{i} = e_{i + } - e_{i - } ;\;\forall i $$
$$ e_{i + } \ge 0;\;\forall i $$
$$ e_{i - } \ge 0;\;\forall i, $$
β: unrestricted.
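As a concrete illustration, this LP can be passed to any off-the-shelf solver. The following is a minimal sketch using SciPy’s linprog; the function name, the ordering of the decision vector, and the synthetic data are illustrative assumptions, not part of the chapter.

```python
# Minimal sketch of the L1/QR estimation LP from Appendix 1 (illustrative).
import numpy as np
from scipy.optimize import linprog

def l1_qr_regression(X, y, theta=0.5):
    """Minimize sum(theta*e_plus + (1-theta)*e_minus)
    subject to X beta + e_plus - e_minus = y, with e_plus, e_minus >= 0."""
    n, p = X.shape
    # Decision vector: [beta (p, free), e_plus (n, >= 0), e_minus (n, >= 0)].
    c = np.concatenate([np.zeros(p), theta * np.ones(n), (1 - theta) * np.ones(n)])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])   # beta'x_i + e+_i - e-_i = y_i
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return res.x[:p]                               # the estimated beta

# Illustrative use: L1 (median) regression on synthetic data, theta = 0.5.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.normal(size=50)])  # constant + one regressor
y = X @ np.array([1.0, 2.0]) + rng.standard_t(df=3, size=50)
print(l1_qr_regression(X, y, theta=0.5))
```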

In this primal LP problem, the \( x_{i} \) are known p-vectors and the \( y_{i} \) are known scalar values. \( \beta \) is a p-vector of decision variables. For L1, choose θ = 0.5; for QR choose any \( \theta \in \left[ {0,\,1} \right] \). This well-known LP formulation has 2n + p decision variables and n linear equality constraints. For this primal LP formulation, duality theory applies and the dual LP problem is:

$$ {\text{max}}:\sum\limits_{i = 1}^{n} {\lambda_{i} } y_{i}. $$

Such that:

$$ X^{\prime } \lambda = 0 $$
$$ \theta - 1 \le \lambda_{i} \le \theta ;\;\forall i. $$

This LP problem has n decision variables, p linear equality constraints, and n bounded variables, so it is usually somewhat faster to solve for large n. Importantly, the optimal values in \( \lambda \) may be used to construct the test statistics developed by Koenker and Bassett (1978).
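For illustration, the dual can be solved just as directly. In the companion sketch below (same caveats as the primal sketch above), the objective is negated because linprog minimizes.

```python
# Minimal sketch of the dual LP from Appendix 1 (illustrative).
import numpy as np
from scipy.optimize import linprog

def l1_qr_dual(X, y, theta=0.5):
    """Maximize y'lambda subject to X'lambda = 0, theta-1 <= lambda_i <= theta."""
    n, p = X.shape
    res = linprog(
        -y,                                 # linprog minimizes, so negate y
        A_eq=X.T, b_eq=np.zeros(p),         # X' lambda = 0 (p equality rows)
        bounds=[(theta - 1, theta)] * n,    # n bounded variables
        method="highs",
    )
    return res.x                            # optimal lambda, one per observation
```

Applied to the X and y of the previous sketch, the optimal dual objective matches the primal optimum, and the optimal \( \lambda_i \) are the values referred to above.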

Appendix 2. L1 or QR Piecewise Multiple Regression with Known Hinges

With one known hinge, Eq. (2.2) describes the predictor and Eq. (2.3) defines the hinge. Let x be a p-vector of known values of the independent variables. Typically, the first element of x is unity, for a constant term in the multiple regression. The hinge given by Eq. (2.3) is then a p-vector, H, which is here assumed known. Define the p-vector \( z = \begin{cases} x, & x \le H \\ x - H, & x > H \end{cases} \), applied elementwise to x and H. Since x and H are known, z has known values. This results in the LP problem:

$$ {\text{min}}:\sum\limits_{i = 1}^{n} {\left( {\theta e_{i + } + (1 - \theta )e_{i - } } \right)} $$

Such that:

$$ y_{i} - \beta_{1}^{\prime } x_{i} - \beta_{2}^{\prime } z_{i} = e_{i + } - e_{i - } ;\;\forall i $$
$$ e_{i + } \ge 0;\;\forall i $$
$$ e_{i - } \ge 0;\;\forall i $$
β1, β2: unrestricted.

For more than one known hinge, this LP is easily extended: simply add an additional \( \beta \) vector and an additional z vector to the formulation for each additional hinge, as in the sketch below.
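The following is a minimal sketch of this construction, reusing l1_qr_regression from the Appendix 1 sketch; the hinge values in H are illustrative assumptions.

```python
# Sketch of the known-hinge transform z from Appendix 2 (illustrative).
import numpy as np

def hinge_features(X, H):
    """Elementwise, per the text: z_j = x_j if x_j <= H_j, else x_j - H_j."""
    return np.where(X <= H, X, X - H)

# With X, y from the earlier sketch and an assumed known hinge H:
H = np.array([1.0, 0.5])        # one entry per element of x (illustrative values)
Z = hinge_features(X, H)
XZ = np.hstack([X, Z])          # regress y on [x, z]; beta1 and beta2 stacked
beta = l1_qr_regression(XZ, y, theta=0.5)
beta1, beta2 = beta[:X.shape[1]], beta[X.shape[1]:]
```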

Appendix 3. L1 or QR Piecewise Multiple Regression with Unknown Hinges

The solution for H = 1 hinge and two pieces is found with the MILP formulation in the second section. Let this solution be denoted by \( \hat{y}_{i} = \hat{y}(1)_{i} ;\;\forall i \) [with notation changes to Eq. (2.7)]; it chooses one of the two linear pieces \( \left( {\hat{y}_{1i} ,\;\hat{y}_{2i} } \right) \) as the regression for each i.

For H = 2 hinges, there are three possible pieces \( \left( {\hat{y}_{1i} ,\;\hat{y}_{2i} ,\;\hat{y}_{3i} } \right) \). The problem reduces to a choice between two linear pieces, \( \left( {\hat{y}_{3i} ,\;\hat{y}(1)_{i} } \right) \); a second set of binary variables and constraints analogous to Eq. (2.7) (with notation changes) enforces this choice and solves the problem for H = 2. Denote this solution by \( \hat{y}(2)_{i} ;\;\forall i \).

This inductive argument can be continued for H = 3, 4, etc. For any number of hinges, an MILP formulation can be created with H(n + 1) binary variables, the main determinant of computing time.
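As one possible concretization, the sketch below solves the H = 1 case for a single regressor with a big-M disjunctive MILP in the spirit of Eq. (2.7), which is not reproduced here. The PuLP modeling library, the value of M, and the monotonicity constraints that keep the two pieces contiguous in x are all illustrative assumptions rather than the chapter’s exact formulation.

```python
# Sketch: two-piece L1/QR regression with an unknown hinge as a big-M MILP
# (illustrative; not the chapter's exact Eq. (2.7) formulation).
import pulp

def l1_two_piece(x, y, theta=0.5, M=1e3):
    n = len(x)
    prob = pulp.LpProblem("piecewise_L1", pulp.LpMinimize)
    a1 = pulp.LpVariable("a1")
    b1 = pulp.LpVariable("b1")        # piece 1: a1 + b1 * x
    a2 = pulp.LpVariable("a2")
    b2 = pulp.LpVariable("b2")        # piece 2: a2 + b2 * x
    ep = [pulp.LpVariable(f"ep{i}", lowBound=0) for i in range(n)]
    em = [pulp.LpVariable(f"em{i}", lowBound=0) for i in range(n)]
    d = [pulp.LpVariable(f"d{i}", cat="Binary") for i in range(n)]  # 1 -> piece 2
    prob += pulp.lpSum(theta * ep[i] + (1 - theta) * em[i] for i in range(n))
    for i in range(n):
        r1 = y[i] - (a1 + b1 * x[i]) - (ep[i] - em[i])  # binds when d[i] = 0
        r2 = y[i] - (a2 + b2 * x[i]) - (ep[i] - em[i])  # binds when d[i] = 1
        prob += r1 <= M * d[i]
        prob += r1 >= -M * d[i]
        prob += r2 <= M * (1 - d[i])
        prob += r2 >= -M * (1 - d[i])
    # One contiguous break: d must be monotone in x (a single hinge).
    order = sorted(range(n), key=lambda i: x[i])
    for i, j in zip(order, order[1:]):
        prob += d[i] <= d[j]
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return [v.value() for v in (a1, b1, a2, b2)]
```

The n binary variables here are what drive the computing time, consistent with the H(n + 1) count noted above.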


Copyright information

© 2014 Springer Science+Business Media New York

Cite this chapter

Cogger, K.O. (2014). Piecewise Linear L1 Modeling. In: Alleman, J., Ní-Shúilleabháin, Á., Rappoport, P. (eds) Demand for Communications Services – Insights and Perspectives. The Economics of Information, Communication, and Entertainment. Springer, Boston, MA. https://doi.org/10.1007/978-1-4614-7993-2_2

