
Assessing the Importance of Risk Factors in Distance-Based Generalized Linear Models

Methodology and Computing in Applied Probability

Abstract

Predictions with distance-based linear and generalized linear models rely on latent variables derived from the distance function. This key feature has a drawback: it adds a non-linear layer between the observed predictors and the response, which shields one from the other and, in particular, prevents us from interpreting the linear predictor coefficients as influence measures. In actuarial applications such as credit scoring or a priori rate-making we cannot forgo this capability, which is crucial for assessing the relative leverage of risk factors. To recover this functionality, we define and study influence coefficients, which measure the relative importance of the observed predictors. Because of the inherent non-linearities of the model, these quantities are unavoidably local, that is, valid only in a neighborhood of a given point in predictor space.
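The idea of a local influence measure can be illustrated with a minimal Python sketch. It is not the influence coefficient defined in the article, whose exact form is not given in the abstract. It assumes a plain distance-based linear model: latent variables are obtained by classical metric scaling of the squared-distance matrix, the response is regressed on them, and a new point is placed in latent space with Gower's interpolation formula. The local influence of each observed predictor is then approximated by a central finite difference of the prediction. All function names (`fit_db_regression`, `predict`, `local_influence`) and the simulated data are hypothetical.

```python
import numpy as np

def sq_euclidean(a, b):
    """Squared Euclidean distances between the rows of a and the rows of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sum(diff ** 2, axis=-1)

def fit_db_regression(Z, y, sq_dist, k):
    """Regress y on the first k latent coordinates from classical metric scaling.

    Assumes the k largest eigenvalues of the centered inner-product matrix
    are strictly positive.
    """
    n = Z.shape[0]
    D2 = sq_dist(Z, Z)
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    G = -0.5 * J @ D2 @ J                      # doubly centered inner products
    evals, evecs = np.linalg.eigh(G)
    top = np.argsort(evals)[::-1][:k]          # indices of the k largest eigenvalues
    lam = evals[top]
    X = evecs[:, top] * np.sqrt(lam)           # latent (principal) coordinates
    # Columns of X are centered, so the intercept is the mean of y.
    beta = np.linalg.lstsq(X, y - y.mean(), rcond=None)[0]
    return {"Z": Z, "X": X, "lam": lam, "g": np.diag(G),
            "beta": beta, "ybar": y.mean(), "sq_dist": sq_dist}

def predict(model, z):
    """Predict at a new point z using Gower's interpolation formula."""
    d2 = model["sq_dist"](np.atleast_2d(z), model["Z"])[0]
    x_new = 0.5 * (model["X"].T @ (model["g"] - d2)) / model["lam"]
    return model["ybar"] + x_new @ model["beta"]

def local_influence(model, z, h=1e-4):
    """Central finite-difference slope of the prediction in each predictor at z."""
    z = np.asarray(z, dtype=float)
    grad = np.empty_like(z)
    for j in range(z.size):
        step = np.zeros_like(z)
        step[j] = h
        grad[j] = (predict(model, z + step) - predict(model, z - step)) / (2 * h)
    return grad

# Simulated data, for illustration only.
rng = np.random.default_rng(0)
Z = rng.normal(size=(60, 3))
y = 1.0 + Z @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=60)

model = fit_db_regression(Z, y, sq_euclidean, k=3)
influence = local_influence(model, Z[0])
print(influence)
```

With the Euclidean distance and as many latent dimensions as predictors, the model reduces to ordinary linear regression, so the slopes are the same at every point and match the least-squares coefficients. Passing a different squared-distance function for `sq_dist` makes the prediction non-linear in the observed predictors, and the slopes then depend on the point `z` at which they are evaluated, which is the locality the abstract describes.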



Author information

Corresponding author

Correspondence to Eva Boj.

Additional information

Work supported in part by the Spanish Ministerio de Educación y Ciencia and FEDER, grant MTM2010-17323, and by Generalitat de Catalunya, AGAUR grant 2014SGR152.


About this article


Cite this article

Boj, E., Costa, T., Fortiana, J. et al. Assessing the Importance of Risk Factors in Distance-Based Generalized Linear Models. Methodol Comput Appl Probab 17, 951–962 (2015). https://doi.org/10.1007/s11009-014-9415-6


  • DOI: https://doi.org/10.1007/s11009-014-9415-6
