Summary:
This paper describes features common to data sets from motor vehicle insurance companies and proposes a general approach that exploits knowledge of these features to model high-dimensional data sets with a complex dependency structure. The results can serve as a basis for developing insurance tariffs. The approach is applied to a collection of data sets from several motor vehicle insurance companies. As an example, we use a nonparametric approach that combines two methods from modern statistical machine learning: kernel logistic regression and ε-support vector regression.
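For orientation, the two methods named above are usually presented as regularized empirical risk minimizers that differ only in their loss function. The block below gives the standard textbook formulations. The notation (f, H, λ, L) is assumed here for illustration and is not taken from the paper, whose own formulation and the way it combines the two methods may differ.

```latex
% Regularized empirical risk minimization over a reproducing kernel
% Hilbert space H (assumed notation, not the paper's):
\hat{f} = \arg\min_{f \in H} \frac{1}{n} \sum_{i=1}^{n} L\bigl(y_i, f(x_i)\bigr) + \lambda \lVert f \rVert_H^2

% Kernel logistic regression: logistic loss, labels y in {-1, +1}
L_{\mathrm{log}}\bigl(y, f(x)\bigr) = \ln\bigl(1 + e^{-y f(x)}\bigr)

% epsilon-support vector regression: epsilon-insensitive loss, real-valued y
L_{\varepsilon}\bigl(y, f(x)\bigr) = \max\bigl\{0, \lvert y - f(x) \rvert - \varepsilon\bigr\}
```

The logistic loss targets a classification question, and the ε-insensitive loss targets a regression question while ignoring residuals smaller than ε.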
Additional information
*This work was supported by the Deutsche Forschungsgemeinschaft (SFB 475, “Reduction of complexity in multivariate data structures”) and by the Forschungsband Do-MuS from the University of Dortmund. I am grateful to Mr. A. Wolfstein and Dr. W. Terbeck from the Verband öffentlicher Versicherer in Düsseldorf, Germany, for making the data set available and for many helpful discussions.
Cite this article
Christmann*, A. An approach to model complex high-dimensional insurance data. Allgemeines Statistisches Arch 88, 375–396 (2004). https://doi.org/10.1007/s101820400178
DOI: https://doi.org/10.1007/s101820400178
Keywords:
- classification
- data mining
- insurance tariffs
- kernel logistic regression
- machine learning
- regression
- robustness
- simplicity
- support vector machine
- support vector regression