Abstract
Actuaries now collect all kinds of information about policyholders, which can be used not only to refine premium calculations but also to carry out prevention operations. We return here to the choice of relevant variables in pricing, with emphasis on actuarial, operational, legal, and ethical motivations. In particular, we discuss the idea of capturing information on the behavior of an insured person, and how difficult it is to reconcile this with the strong constraints of both privacy and fairness.
Notes
- 1.
Euclid’s treatise on plane geometry was named δεδομένα, translated as “data.”
- 2.
CNIL is the Commission Nationale de l’Informatique et des Libertés (National Commission on Informatics and Liberty), an independent French administrative regulatory body whose mission is to ensure that data privacy law is applied to the collection, storage, and use of personal data, in France.
- 3.
- 4.
- 5.
CA: California, HI: Hawaii, GA: Georgia, NC: North Carolina, NY: New York, MA: Massachusetts, PA: Pennsylvania, FL: Florida, TX: Texas.
- 6.
AL: Alberta, ON: Ontario, NB: New Brunswick, NL: Newfoundland and Labrador, QC: Québec.
- 7.
Project https://immersion.media.mit.edu/.
- 8.
Instead of the Latin formula that could designate a contract. Actually, “formula” nowadays refers to “mathematical formulas” as seen in Chaps. 3 and 4, or “magic formulas,” the two being very close for many people (see for example the introduction of O’Neil (2016) explaining how mathematics “was not only deeply entangled in the world’s problems but also fueling many of them. The housing crisis, the collapse of major financial institutions, the rise of unemployment—all had been aided and abetted by mathematicians wielding magic formulas.”).
- 9.
- 10.
Equivalent, in the UK, of 911 in North America, 112 in many European countries, or 0118 999 881 999 119 725 3.
- 11.
To continue the analogy, in credit risk, we find the three previous levels, with (1) those who do not apply for credit, (2) those to whom the institution does not offer credit, and (3) those who are not interested in the offer made.
- 12.
Called “The Great AI Debate: Interpretability is necessary for machine learning,” pitting Rich Caruana and Patrice Simard (for) against Kilian Weinberger and Yann LeCun (against); see https://youtu.be/93Xv8vJ2acI.
References
Abrams M (2014) The origins of personal data and its implications for governance. SSRN 2510927
Achenwall G (1749) Abriß der neuesten Staatswissenschaft der vornehmsten Europäischen Reiche und Republicken zum Gebrauch in seinen Academischen Vorlesungen. Schmidt
Alipourfard N, Fennell PG, Lerman K (2018) Can you trust the trend? Discovering Simpson’s paradoxes in social data. In: Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, pp 19–27
Allerhand L, Youngmann B, Yom-Tov E, Arkadir D (2018) Detecting Parkinson’s disease from interactions with a search engine: Is expert knowledge sufficient? In: Proceedings of the 27th ACM International Conference on Information and Knowledge Management, pp 1539–1542
de Andrade N (2012) Oblivion: The right to be different from oneself-reproposing the right to be forgotten. In: Cerrillo Martínez A, Peguera Poch M, Peña López I, Vilasau Solana M (eds) VII international conference on internet, law & politics. Net neutrality and other challenges for the future of the Internet, IDP. Revista de Internet, Derecho y Política, 13, pp 122–137
Ausloos J (2020) The right to erasure in EU data protection law. Oxford University Press, Oxford
Automobile Insurance Rate Board (2022) Technical guidance: Change in rates and rating programs. Alberta AIRB
Avraham R, Logue KD, Schwarcz D (2013) Understanding insurance antidiscrimination law. South California Law Rev 87:195
Backer DC (2017) Risk profiling in the auto insurance industry. Gracey-Backer, Inc Blog March 14
Bagdasaryan E, Poursaeed O, Shmatikov V (2019) Differential privacy has disparate impact on model accuracy. Adv Neural Inf Process Syst 32:15479–15488
Bailey RA, Simon LJ (1960) Two studies in automobile insurance ratemaking. ASTIN Bull J IAA 1(4):192–217
Banham R (2015) Price optimization or price discrimination? regulators weigh in. Carrier Management May 17
Barbosa JJR (2019) The business opportunities of implementing wearable based products in the health and life insurance industries. PhD thesis, Universidade Católica Portuguesa
Bath C, Edgar K (2010) Time is money: Financial responsibility after prison. Prison Reform Trust, London
Beckett L (2014) Everything we know about what data brokers know about you. ProPublica June 13
Bickel PJ, Hammel EA, O’Connell JW (1975) Sex bias in graduate admissions: Data from Berkeley. Science 187(4175):398–404
Bigot R, Cocteau-Senn D, Arthur C (2019) La protection des données personnelles en assurance : dialogue du juriste avec l’actuaire. In: Netter E (ed) Regards sur le nouveau droit des données personnelles, CEPRISCA, collection Colloques
Bouk D (2022) Democracy’s data: the hidden stories in the U.S. census and how to read them. MCD
Brown RL, Charters D, Gunz S, Haddow N (2007) Colliding interests–age as an automobile insurance rating variable: Equitable rate-making or unfair discrimination? J Bus Ethics 72(2):103–114
Buolamwini J, Gebru T (2018) Gender shades: Intersectional accuracy disparities in commercial gender classification. In: Conference on Fairness, Accountability and Transparency, Proceedings of Machine Learning Research, pp 77–91
Butler P, Butler T (1989) Driver record: A political red herring that reveals the basic flaw in automobile insurance pricing. J Insurance Regulat 8(2):200–234
Calders T, Žliobaite I (2013) Why unbiased computational processes can lead to discriminative decision procedures. In: Discrimination and privacy in the information society, pp 43–57. Springer, New York
Carnis L, Lassarre S (2019) Politique et management de la sécurité routière. In: Laurent C, Catherine G, Marie-Line G (eds) La sécurité routière en France, Quand la recherche fait son bilan et trace des perspectives, L’Harmattan
Chakraborty S, Raghavan KR, Johnson MP, Srivastava MB (2013) A framework for context-aware privacy of sensor data on mobile systems. In: Proceedings of the 14th Workshop on Mobile Computing Systems and Applications, Association for Computing Machinery, HotMobile ’13
Charpentier A, Flachaire E, Ly A (2018) Econometrics and machine learning. Economie et Statistique 505(1):147–169
Chen Y, Liu Y, Zhang M, Ma S (2017) User satisfaction prediction with mouse movement information in heterogeneous search environment. IEEE Trans Knowl Data Eng 29(11):2470–2483
Christensen CM, Dillon K, Hall T, Duncan DS (2016) Competing against luck: The story of innovation and customer choice. Harper Business, New York
Cohen JE (1986) An uncertainty principle in demography and the unisex issue. Am Stat 40(1):32–39
Collins E (2018) Punishing risk. Georgetown Law J 107:57
Constine J (2017) Facebook rolls out AI to detect suicidal posts before they’re reported. Techcrunch November 27
Coutts S (2016) Anti-choice groups use smartphone surveillance to target ‘abortion-minded women’ during clinic visits. Rewire News Group May 25
Cummins JD, Smith BD, Vance RN, Vanderhel J (2013) Risk classification in life insurance, vol 1. Springer Science & Business Media, New York
Dalenius T (1977) Towards a methodology for statistical disclosure control. Statistik Tidskrift 15:429–444
Davidson R, MacKinnon JG, et al. (2004) Econometric theory and methods, vol 5. Oxford University Press, New York
Davis GA (2004) Possible aggregation biases in road safety research and a mechanism approach to accident modeling. Accident Anal Prevent 36(6):1119–1127
Debet A (2007) Mesure de la diversité et protection des données personnelles. Commission Nationale de l’Informatique et des Libertés 16/05/2007 08:40 DECO/IRC
Delaporte P (1962) Sur l’efficacité des critères de tarification de l’assurance contre les accidents d’automobiles. ASTIN Bull J IAA 2(1):84–95
Delaporte PJ (1965) Tarification du risque individuel d’accidents d’automobiles par la prime modelée sur le risque. ASTIN Bull J IAA 3(3):251–271
Depoid P (1967) Applications de la statistique aux assurances accidents et dommages: cours professé à l’Institut de statistique de l’Université de Paris. 2e édition revue et augmentée… Berger-Levrault
Desrosières A (1998) The politics of large numbers: A history of statistical reasoning. Harvard University Press, Harvard
Dilley S, Greenwood G (2017) Abandoned 999 calls to police more than double. BBC 19 September 2017
Dressel J, Farid H (2018) The accuracy, fairness, and limits of predicting recidivism. Sci Adv 4(1):eaao5580
Duhigg C (2019) How companies learn your secrets. The New York Times 02-16-2019
Dumas A, Allodji R, Fresneau B, Valteau-Couanet D, El-Fayech C, Pacquement H, Laprie A, Nguyen TD, Bondiau PY, Diallo I, et al. (2017) The right to be forgotten: a change in access to insurance and loans after childhood cancer? J Cancer Survivorship 11:431–437
Dwoskin E (2018) Facebook is rating the trustworthiness of its users on a scale from zero to one. Washington Post 21-08
Eidinger E, Enbar R, Hassner T (2014) Age and gender estimation of unfiltered faces. IEEE Trans Inf Forens Secur 9(12):2170–2179
European Commission (1995) Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995 on the protection of individuals with regard to the processing of personal data and on the free movement of such data. Official J Eur Communit 38(281):31–50
Finger RJ (2006) Risk classification. In: Bass I, Basson S, Bashline D, Chanzit L, Gillam W, Lotkowski E (eds) Foundations of Casualty Actuarial Science, Casualty Actuarial Society, pp 287–341
Flanagan T (1985) Insurance, human rights, and equality rights in Canada: When is discrimination “reasonable?”. Canad J Polit Sci/Revue canadienne de science politique 18(4):715–737
Freedman DA (1999) Ecological inference and the ecological fallacy. Int Encyclopedia Soc Behav Sci 6:4027–4030
Friedman S, Canaan M (2014) Overcoming speed bumps on the road to telematics. In: Challenges and opportunities facing auto insurers with and without usage-based programs, Deloitte
Frisch R, Waugh FV (1933) Partial time regressions as compared with individual trends. Econometrica, 387–401
Gambs S, Killijian MO, del Prado Cortez MNn (2010) Show me how you move and I will tell you who you are. In: Proceedings of the 3rd ACM International Workshop on Security and Privacy in GIS and LBS
Garg N, Schiebinger L, Jurafsky D, Zou J (2018) Word embeddings quantify 100 years of gender and ethnic stereotypes. Proc Natl Acad Sci 115(16):E3635–E3644
Gelman A (2009) Red state, blue state, rich state, poor state: Why Americans vote the way they do. Princeton University Press, Princeton
Giles C (2020) Goodhart’s law comes back to haunt the UK’s Covid strategy. Financial Times 14-5
Hand DJ (2020) Dark data: why what you don’t know matters. Princeton University Press, Princeton
Harcourt BE (2015a) Exposed: Desire and disobedience in the digital age. Harvard University Press, Harvard
Heidorn PB (2008) Shedding light on the dark data in the long tail of science. Library Trends 57(2):280–299
Henley A (2014) Abolishing the stigma of punishments served: Andrew Henley argues that those who have been punished should be free from future discrimination. Criminal Justice Matters 97(1):22–23
Hill K (2022) A dad took photos of his naked toddler for the doctor. Google flagged him as a criminal. The New York Times August 25
Hooker S, Moorosi N, Clark G, Bengio S, Denton E (2020) Characterising bias in compressed models. arXiv 2010.03058
Insurance Bureau of Canada (2021) Facts of the property and casualty insurance industry in Canada. Insurance Bureau of Canada
Iten R, Wagner J, Zeier Röschmann A (2021) On the identification, evaluation and treatment of risks in smart homes: A systematic literature review. Risks 9(6):113
Jarvis B, Pearlman RF, Walsh SM, Schantz DA, Gertz S, Hale-Pletka AM (2019) Insurance rate optimization through driver behavior monitoring. Google Patents 10,169,822
Jones EE, Nisbett RE (1971) The actor and the observer: Divergent perceptions of the causes of behavior. General Learning Press, New York
Jones ML (2016) Ctrl + Z: The Right to Be Forgotten. New York University Press, New York
Karapiperis D, Birnbaum B, Brandenburg A, Castagna S, Greenberg A, Harbage R, Obersteadt A (2015) Usage-based insurance and vehicle telematics: insurance market and regulatory implications. CIPR Study Ser 1:1–79
Keffer R (1929) An experience rating formula. Trans Actuarial Soc Am 30:130–139
Kelly H (2021) A priest’s phone location data outed his private life. It could happen to anyone. The Washington Post 22-07-2021
Kelly M, Nielson N (2006) Age as a variable in insurance pricing and risk classification. Geneva Papers Risk Insurance Issues Pract 31(2):212–232
Keyfitz K, Flieger W, et al. (1968) World population: an analysis of vital data. The University of Chicago Press, Chicago
King G, Tanner MA, Rosen O (2004) Ecological inference: New methodological strategies. Cambridge University Press, Cambridge
Kiviat B (2021) Which data fairly differentiate? American views on the use of personal data in two market settings. Sociol Sci 8:26–47
Lancaster R, Ward R (2002) The contribution of individual factors to driving behaviour: Implications for managing work-related road safety. HM Stationery Office
Lauer J (2017) Creditworthy: a history of consumer surveillance and financial identity in America. Columbia University Press, New York
Laulom S (2012) Égalité des sexes et primes d’assurances. Semaine Sociale Lamy 1531:44–49
Lemaire J (1985) Automobile insurance: actuarial models, vol 4. Springer Science & Business Media, New York
Lemaire J, Park SC, Wang KC (2016) The use of annual mileage as a rating variable. ASTIN Bull J IAA 46(1):39–69
Lovejoy B (2021) LinkedIn breach reportedly exposes data of 92% of users, including inferred salaries. 9to5mac 06/29
Mangel M, Samaniego FJ (1984) Abraham Wald’s work on aircraft survivability. J Am Stat Assoc 79(386):259–267
Mantelero A (2013) The EU proposal for a general data protection regulation and the roots of the ‘right to be forgotten’. Comput Law Secur Rev 29(3):229–235
Mayer J, Mutchler P, Mitchell JC (2016) Evaluating the privacy properties of telephone metadata. Proc Natl Acad Sci 113(20):5536–5541
Mbungo R (2014) L’approche juridique internationale du phénomène de discrimination fondée sur le motif des antécédents judiciaires. Revue québécoise de droit international 27(2):59–97
Merrill D (2012) New credit scores in a new world: Serving the underbanked. TEDxNewWallStreet
Meyers G, Van Hoyweghen I (2018) Enacting actuarial fairness in insurance: From fair discrimination to behaviour-based fairness. Sci Culture 27(4):413–438
Miracle JM (2016) De-anonymization attack anatomy and analysis of Ohio nursing workforce data anonymization. PhD thesis, Wright State University
Morrison EJ (1996) Insurance discrimination against battered women: Proposed legislative protections. Ind LJ 72:259
Nakashima R (2018) Google tracks your movements, like it or not. Associated Press August 14
Noguéro D (2010) Sélection des risques. discrimination, assurance et protection des personnes vulnérables. Revue générale du droit des assurances 3:633–663
O’Neil C (2016) Weapons of math destruction: How big data increases inequality and threatens democracy. Crown
Pager D (2003) The mark of a criminal record. Am J Sociol 108(5):937–975
Pager D (2008) Marked: Race, crime, and finding work in an era of mass incarceration. University of Chicago Press, Chicago
Parléani G (2012) Commentaire des lignes directrices de la commission européenne sur les suites de l’arrêt “test achats”. Revue générale du droit des assurances 3:563
Poku M (2016) Campbell’s law: implications for health care. J Health Serv Res Policy 21(2):137–139
Pope DG, Sydnor JR (2011) Implementing anti-discrimination policies in statistical profiling models. Am Econ J Econ Policy 3(3):206–31
Pradier PC (2011) (petite) histoire de la discrimination (dans les assurances). Risques 87:51–57
Prince AE, Schwarcz D (2019) Proxy discrimination in the age of artificial intelligence and big data. Iowa Law Rev 105:1257
Ribeiro MT, Singh S, Guestrin C (2016) “Why should I trust you?” Explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp 1135–1144
Robinson WS (1950) Ecological correlations and the behavior of individuals. Am Sociol Rev 15(3):351–357
Rosen J (2011) The right to be forgotten. Stan L Rev Online 64:88
Rubinow I (1936) State pool plans and merit rating. Law Contemp Probl 3(1):65–88
Sanche F, Roberge I (2023) La question de la semaine sur le casier judiciaire et les assurances. Radio Canada January 31
Sandel MJ (2020) The tyranny of merit: What’s become of the common good? Penguin, UK
Schneier B (2015) Data and Goliath: The hidden battles to collect your data and control your world. WW Norton & Company, New York
Scism L, Maremont M (2010a) Inside Deloitte’s life-insurance assessment technology. Wall Street Journal November 19
Scism L, Maremont M (2010b) Insurers test data profiles to identify risky clients. Wall Street Journal November 19
Seelye KQ (1994) Insurability for battered women. New York Times May 12
Speicher T, Ali M, Venkatadri G, Ribeiro FN, Arvanitakis G, Benevenuto F, Gummadi KP, Loiseau P, Mislove A (2018) Potential for discrimination in online targeted advertising. In: Conference on Fairness, Accountability and Transparency, Proceedings of Machine Learning Research, pp 5–19
Spender A, Bullen C, Altmann-Richer L, Cripps J, Duffy R, Falkous C, Farrell M, Horn T, Wigzell J, Yeap W (2019) Wearables and the internet of things: Considerations for the life and health insurance industry. Brit Actuarial J 24:e22
Stein A (1994) Will health care reform protect victims of abuse-treating domestic violence as a public health issue. Human Rights 21:16
Stevenson M (2018) Assessing risk assessment in action. Minnesota Law Rev 103:303
Suresh H, Guttag JV (2019) A framework for understanding sources of harm throughout the machine learning life cycle. arXiv 1901.10002
Szalavitz M (2017) Why do we think poor people are poor because of their own bad choices. The Guardian July 5
Taylor A, Sadowski J (2015) How companies turn your Facebook activity into a credit score. The Nation May 27
The Zebra (2022) Car insurance rating factors by state. https://www.thezebra.com/
Tufekci Z (2018) Facebook’s surveillance machine. New York Times 19:1
Van Deemter K (2010) Not exactly: In praise of vagueness. Oxford University Press, Oxford
Van Schaack D (1926) The part of the casualty insurance company in accident prevention. Ann Am Acad Polit Soc Sci 123(1):36–40
Wachter S, Mittelstadt B (2019) A right to reasonable inferences: re-thinking data protection law in the age of big data and AI. Columbia Bus Law Rev, 494
Westreich D (2012) Berkson’s bias, selection bias, and missing data. Epidemiology 23(1):159
White RW, Doraiswamy PM, Horvitz E (2018) Detecting neurodegenerative disorders from web search signals. NPJ Digit Med 1(1):8
Wikipedia (2023) Data. Wikipedia, The Free Encyclopedia
Wilcox C (1937) Merit rating in state unemployment compensation laws. Am Econ Rev, 253–259
Yeung K (2018a) Algorithmic regulation: a critical interrogation. Regulation Governance 12(4):505–523
Yeung K (2018b) A study of the implications of advanced digital technologies (including AI systems) for the concept of responsibility within a human rights framework. MSI-AUT (2018) 5
Žliobaite I, Custers B (2016) Using sensitive personal data may be necessary for avoiding discrimination in data-driven decision models. Artif Intell Law 24(2):183–201
Zuboff S (2019) The age of surveillance capitalism: The fight for a human future at the new frontier of power. Public Affairs, New York
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Charpentier, A. (2024). What Data?. In: Insurance, Biases, Discrimination and Fairness. Springer Actuarial. Springer, Cham. https://doi.org/10.1007/978-3-031-49783-4_5
DOI: https://doi.org/10.1007/978-3-031-49783-4_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-49782-7
Online ISBN: 978-3-031-49783-4
eBook Packages: Mathematics and Statistics (R0)