Testing Shape Constraints in Lasso Regularized Joinpoint Regression

Part of the Springer Proceedings in Mathematics & Statistics book series (PROMS,volume 193)

Abstract

Joinpoint regression models are very popular in many applied areas, mainly because of the very simple interpretation they offer. In some situations it is moreover useful to require that the final model satisfy certain additional qualitative properties; properties related to the monotonicity of the fit are of main interest in this paper. We propose a LASSO regularized approach to estimating such models, in which optional shape constraints can be implemented in a straightforward way as a set of linear inequalities and are taken into account simultaneously within the estimation process itself. As the main result we derive a testing approach that can be used to statistically verify the validity of the imposed shape restrictions in piecewise linear continuous models. We also investigate some finite sample properties via a simulation study.

Keywords

  • Joinpoint regression
  • Regularization
  • Shape constraints
  • Post-selection inference

Notes

  1.

    Various names are used in the literature to refer to joinpoint regression models, for instance segmented regression models, piecewise linear models, threshold models, or sequential linear models.

  2.

    If the shape constraints in (6) refer to monotonicity of the final fit (e.g. a non-decreasing function), the corresponding matrix \(\varvec{A}\) equals \(\varvec{A}_{1}\) below. If the constraints in (6) refer instead to an isotonic property (e.g. a convex function), the corresponding matrix \(\varvec{A}\) should equal \(\varvec{A}_{2}\) (the estimate of the overall slope \(\beta _{1}\) is irrelevant for isotonic properties of the final fit; thus the first row of \(\varvec{A}_{2}\) consists of zeros, or it can be deleted with \(\varvec{\beta }_{(-2)}\) considered instead of \(\varvec{\beta }_{(-1)}\)).

    \(\varvec{A}_{1} = \begin{pmatrix} 1 & 0 & \dots & \dots & 0\\ 1 & 1 & 0 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 1 & \dots & 1 & 1 & 0\\ 1 & \dots & \dots & 1 & 1 \end{pmatrix} \in \mathbb {R}^{(n - 1) \times (n - 1)}\) and \(\varvec{A}_{2} = \begin{pmatrix} 0 & 0 & \dots & \dots & 0\\ 0 & 1 & 0 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & \dots & 0 & 1 & 0\\ 0 & \dots & \dots & 0 & 1 \end{pmatrix} \in \mathbb {R}^{(n - 1) \times (n - 1)}\). Analogous matrices for other scenarios can be obtained in a similar way as an easy exercise.

  3.

    In order to solve the minimization problem in (6) subject to the constraints stated in (7), we use the MOSEK optimization toolbox and the MOSEK-to-R interface available in the R package Rmosek.

  4.

    By the LASSO path history we understand the sequence of variables that progressively enter the model as the estimation proceeds. In each step of the estimation procedure exactly one parameter either enters the current model (if it is not yet active) or leaves it (if it was active); only one of these two events can occur per step. The LASSO history records this entering/leaving process, starting from the zero model in which all regularized parameters are set to zero.

  5.

    The vector of estimated signs \(\hat{\varvec{s}}\) is relevant only for the LASSO regularized estimates in \(\varvec{\beta }_{(-2)}\), and it holds that \(\hat{s}_{j} = \mathrm{sign}(\hat{\beta }_{j + 1})\) for all \(j = 1, \dots , n - 2\).

  6.

    By a truncated normal distribution we understand a distribution with a cumulative distribution function \(F_{\mu , \sigma ^2}^{[a,b]}(x)\) truncated to the interval \([a, b] \subset \mathbb {R}\), where \(F_{\mu , \sigma ^2}^{[a,b]}(x) = \frac{\Phi ((x - \mu )/\sigma ) - \Phi ((a - \mu )/\sigma )}{\Phi ((b - \mu )/\sigma ) - \Phi ((a - \mu )/\sigma )}\), for \(\Phi (\cdot )\) being the cumulative distribution function of the standard Gaussian random variable.

  7.

    In the case of isotonic constraints, where the intercept parameter plays no role in the test, one can consider the set of indices of the active parameters only, the set \(\mathscr {A}\).

  8.

    One can assume different shape constraints. In this paper, however, we only present a small part of the simulation results for this specific restriction.
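The constraint matrices of footnote 2 are easy to construct explicitly. The sketch below (the function and variable names are ours, not the chapter's) builds \(\varvec{A}_{1}\) as a lower-triangular matrix of ones, so that \(\varvec{A}_{1}\varvec{\beta } \ge \varvec{0}\) requires all cumulative (segment) slopes to be non-negative, and \(\varvec{A}_{2}\) as the identity with a zeroed first entry, which requires the individual slope changes to be non-negative:

```python
import numpy as np

# Sketch only: build the constraint matrices for n observations, assuming the
# parametrization where beta_1 is the overall slope and the remaining entries
# are slope changes at the candidate joinpoints.
def constraint_matrices(n):
    m = n - 1
    A1 = np.tril(np.ones((m, m)))   # monotonicity: cumulative slopes >= 0
    A2 = np.eye(m)
    A2[0, 0] = 0.0                  # convexity: beta_1 is unconstrained
    return A1, A2

A1, A2 = constraint_matrices(5)
beta = np.array([0.5, 0.2, -0.1, 0.3])   # hypothetical parameter vector
print(np.all(A1 @ beta >= 0))            # non-decreasing fit? -> True here
print(np.all(A2 @ beta >= 0))            # convex fit? -> False (-0.1 < 0)
```

This hypothetical \(\varvec{\beta }\) thus yields a monotone but non-convex fit.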
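The entering/leaving process of footnote 4 can be illustrated on the simplest possible case. The sketch below is our own illustration, not the chapter's constrained solver: for an orthonormal design the LASSO solution is the soft-thresholded OLS estimate, so the path history is simply the order in which the absolute OLS coefficients exceed the decreasing penalty (in a general design, active parameters may also leave the model along the path):

```python
import numpy as np

# Soft-thresholding operator: the closed-form LASSO solution per coordinate
# when the design matrix is orthonormal.
def soft_threshold(b, lam):
    return np.sign(b) * np.maximum(np.abs(b) - lam, 0.0)

b_ols = np.array([2.5, -0.3, 1.1, 0.0, -1.8])   # hypothetical OLS estimates
history = []       # order in which parameters enter the model
active = set()
for lam in np.linspace(np.abs(b_ols).max(), 0.0, 200):
    beta = soft_threshold(b_ols, lam)
    now_active = {j for j in range(len(beta)) if beta[j] != 0.0}
    for j in sorted(now_active - active):
        history.append(j)           # parameter j enters at this lambda
    active = now_active

print(history)   # -> [0, 4, 2, 1]; index 3 (zero coefficient) never enters
```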
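The truncated normal distribution of footnote 6 is straightforward to evaluate directly. A minimal sketch (our helper names), with the standard normal cumulative distribution function implemented via the error function:

```python
from math import erf, sqrt

# Standard normal CDF via the error function.
def std_normal_cdf(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def truncated_normal_cdf(x, mu, sigma, a, b):
    """CDF of N(mu, sigma^2) truncated to the interval [a, b]."""
    num = std_normal_cdf((x - mu) / sigma) - std_normal_cdf((a - mu) / sigma)
    den = std_normal_cdf((b - mu) / sigma) - std_normal_cdf((a - mu) / sigma)
    return min(max(num / den, 0.0), 1.0)

# Sanity check: the truncated CDF runs from 0 at a to 1 at b,
# and equals 0.5 at the center of a symmetric interval.
print(truncated_normal_cdf(0.0, 0.0, 1.0, -1.0, 1.0))   # -> 0.5
```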


Acknowledgements

We thank Ivan Mizera, Bodhisattva Sen, and the referee for their comments and remarks. This work was partially supported by the PRVOUK grant 300-04/130.

Author information

Correspondence to Matúš Maciak.


Copyright information

© 2017 Springer International Publishing AG

Cite this paper

Maciak, M. (2017). Testing Shape Constraints in Lasso Regularized Joinpoint Regression. In: Antoch, J., Jurečková, J., Maciak, M., Pešta, M. (eds) Analytical Methods in Statistics. AMISTAT 2015. Springer Proceedings in Mathematics & Statistics, vol 193. Springer, Cham. https://doi.org/10.1007/978-3-319-51313-3_6