Abstract
Insurance companies have started to collect high-frequency GPS car driving data to analyze the driving styles of their policyholders. In previous work, we introduced speed and acceleration heatmaps and categorized them with the K-means algorithm to differentiate driving styles. In many situations, however, low-dimensional continuous representations are more useful than unordered categories. In the present work we use singular value decomposition and bottleneck neural networks (autoencoders) for principal component analysis. We show that a two-dimensional representation is sufficient to reconstruct the heatmaps with high accuracy (measured by Kullback–Leibler divergences).
Acknowledgements
Guangyuan Gao: Financially supported by the Social Science Fund of China (Grant no. 16ZDA052) and MOE National Key Research Bases for Humanities and Social Sciences (Grant no. 16JJD910001).
Appendix: KL divergence, revisited
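In this appendix we briefly revisit the KL divergence. Denote by \(\mathcal{X} \subset {\mathbb {R}}^J\) the \((J-1)\)-unit simplex. Consider \(k\) independent and identically distributed trials among \(J\) classes providing a multinomial distribution with parameter \(\pi =(\pi _1,\ldots ,\pi _J)' \in \mathcal{X}\) given by the discrete probability weights
\[ p_\pi (k_1,\ldots ,k_J) = \frac{k!}{k_1!\cdots k_J!}\, \prod _{j=1}^J \pi _j^{k_j}, \]
for \(k_j \in {\mathbb {N}}_0\), \(j=1,\ldots , J\), with \(\sum _{j=1}^J k_j=k\). The deviance statistic of an observation \((k_1,\ldots ,k_J)\) of that multinomial distribution is given by
\[ D = 2 \sum _{j=1}^J k_j \log \left( \frac{k_j}{k\,\pi _j}\right) , \]
with the convention \(0 \log 0 = 0\). In Sect. 3 we have defined the empirical distributions on the \((J-1)\)-unit simplex by setting \(x_j=k_j/k\) which, of course, provides \(\varvec{x}=(x_1,\ldots , x_J)'\in \mathcal{X}\). Applying this transformation we obtain
\[ D = 2k \sum _{j=1}^J x_j \log \left( \frac{x_j}{\pi _j}\right) , \]
that is, \(2k\) times the KL divergence between \(\varvec{x}\) and \(\pi\).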
Thus, by minimizing the KL divergence in (3.3), we minimize the corresponding deviance statistic, which yields the maximum likelihood estimator of the network parameter \(\theta\) under independent multinomial models (having \(J\) classes) for drivers \(i=1,\ldots , n\). This additionally assumes that all drivers have identical weights \(k\). If the latter is not appropriate, we may replace the average KL divergence in (3.3) by a weighted counterpart
\[ \sum _{i=1}^n w_i \, d_{\mathrm{KL}}\left( \varvec{x}_i \,\Vert \, \pi (\varvec{x}_i)\right) , \]
where \(\pi (\varvec{x}_i)\in \mathcal{X}\) denotes the network output for driver \(i\), with weights \(w_i\ge 0\) satisfying \(\sum _{i=1}^n w_i=1\).
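The relation between the multinomial deviance and the KL divergence discussed in this appendix can be checked numerically. The following is a minimal sketch, not the implementation used in the paper: the function names, the toy counts, the fixed class probabilities standing in for network outputs, and the weight choice \(w_i = k_i/\sum _l k_l\) are all illustrative assumptions.

```python
import numpy as np


def kl_divergence(x, pi):
    """KL divergence sum_j x_j log(x_j / pi_j), using the convention 0 log 0 = 0."""
    x = np.asarray(x, dtype=float)
    pi = np.asarray(pi, dtype=float)
    mask = x > 0  # terms with x_j = 0 contribute zero
    return float(np.sum(x[mask] * np.log(x[mask] / pi[mask])))


def multinomial_deviance(counts, pi):
    """Multinomial deviance 2 sum_j k_j log(k_j / (k pi_j)) with k = sum_j k_j."""
    counts = np.asarray(counts, dtype=float)
    pi = np.asarray(pi, dtype=float)
    k = counts.sum()
    mask = counts > 0
    return float(2.0 * np.sum(counts[mask] * np.log(counts[mask] / (k * pi[mask]))))


def weighted_kl_loss(X, P, w):
    """Weighted KL loss sum_i w_i d_KL(x_i || p_i), with w_i >= 0 and sum_i w_i = 1."""
    w = np.asarray(w, dtype=float)
    if np.any(w < 0) or not np.isclose(w.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to one")
    return float(sum(w_i * kl_divergence(x_i, p_i) for w_i, x_i, p_i in zip(w, X, P)))


# Toy data (hypothetical): two drivers, J = 3 classes, different totals k_i.
counts = np.array([[12, 5, 3], [40, 30, 10]])
# Stand-in for the network outputs; here simply fixed class probabilities.
P = np.array([[0.5, 0.3, 0.2], [0.5, 0.3, 0.2]])

k = counts.sum(axis=1)        # number of trials per driver
X = counts / k[:, None]       # empirical distributions x_i on the simplex
w = k / k.sum()               # assumed weights proportional to k_i

deviances = np.array([multinomial_deviance(c, p) for c, p in zip(counts, P)])
kls = np.array([kl_divergence(x, p) for x, p in zip(X, P)])
loss = weighted_kl_loss(X, P, w)
```

Under the assumed choice \(w_i = k_i/\sum _l k_l\), the weighted KL loss equals the total deviance over all drivers divided by \(2\sum _l k_l\), so minimizing one minimizes the other even when the drivers have different numbers of observations.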
About this article
Cite this article
Gao, G., Wüthrich, M.V. Feature extraction from telematics car driving heatmaps. Eur. Actuar. J. 8, 383–406 (2018). https://doi.org/10.1007/s13385-018-0181-7
Keywords
- Telematics car driving data
- Driving styles
- Unsupervised learning
- Pattern recognition
- Image recognition
- Bottleneck neural network
- Autoencoder
- Singular value decomposition
- Principal component analysis
- K-means algorithm
- Kullback–Leibler divergence