1 Introduction

Personalization has always been a key factor in health insurance product provision. Traditionally, it involves a screening based on static information from the customer: their medical record and the questionnaires they answer. But their medical history, captured by clinical data, can be scarce, and it is certainly just one determinant of health. The way people live their lives, captured by behavioral data, is the second determinant, and for risk assessment of chronic, age-related conditions in currently seemingly healthy individuals, it is the most important one, as indicated by several studies. A study on diabetes prevention [1] provides evidence for the importance of lifestyle for the outcomes in youths and adults. Another study [2] correlates health responsibility, physical activity, and stress management with obesity, a major risk factor for cardiovascular diseases, type 2 diabetes, and some forms of cancer. The 2017 Global Burden of Disease Study [3] considers behavioral, environmental, occupational, and metabolic risk factors.

Risk assessment has always been an integral part of the insurance industry [4]. Unlike risk assessment in medicine, which is based on continuous estimation of risk factors, its insurance counterpart is usually static, done at the beginning of a contract with a client. Dynamic personalized products have only recently started to appear, in the form of data-based digital risk assessment platforms. Such platforms are starting to transform insurance by disrupting the ways premiums are calculated [5] and are already being utilized in car insurance. There, continuous vehicle-based risk assessment [6] is considered important for optimizing insurance premiums and providing personalized services to drivers. Specifically, driver behavior is analyzed [7] for usage-based insurance. Moreover, telematic driving profile classification [8] has facilitated pricing innovations in German car insurance.

Similarly, personalization of health insurance products needs to be based on continuous risk assessment of the individual, since lifestyle and behavior cannot be assessed at one instance in time; they involve people’s habits and their continuous change. Health insurance products employing continuous assessment of customers’ lifestyle and behavior are dynamically personalized.

Behavioral assessments, much like their clinical counterparts, rely on data. For behavior, the data collection needs to be continuous, facilitated by software tools for the collection of information capturing the important aspects of lifestyle and behavior. In the INFINITECH project [9], insurance experts define the data to be collected, and the Healthentia eClinical system [10] facilitates the collection. Specifically, Healthentia provides interfaces for data collection from medical and consumer devices, including IoT (Internet of Things) devices. Moreover, continuous risk assessment services are provided to health insurance professionals by training machine learning (ML) prediction models for the required health parameters. ML has been used in the insurance industry to analyze insurance claim data [11, 12]. Vehicle insurance coverage affects driving behavior and hence insurance claims [13]. These previous works employed ML to analyze data at the end of the insurance pathway, after the event. Instead, in this chapter, we follow the approach in [14]. We expand on the results presented therein, focusing on the continuous analysis of data at the customer side to personalize the health insurance product by modifying the insurance pathway.

Personalized dynamic product offerings benefit both the insurance companies and their customers, but the continuous assessment imposes a burden on the customers. Insurance companies gain a competitive advantage by offering lower prices to low-risk customers. The customers have a direct financial benefit in the form of reduced premiums due to the lower risk of their healthy behavior. They also have an indirect benefit stemming from coaching about aspects of their lifestyle, both those that drive the risk assessment models toward positive decisions and those that drive them toward negative decisions. The identification of these aspects is made possible by explainable AI techniques applied to the individual model decisions. The insurance companies need to balance the increased burden of the monitoring with the added financial and health benefits of using such a system.

The system for personalized health insurance products devised in the INFINITECH project is presented in Sect. 2 of this chapter. Then, its main components are detailed, covering the data collection (Sect. 3), the model training test bed (Sect. 4), and the provided ML services of risk assessment and lifestyle coaching (Sect. 5). Finally, the conclusions are drawn in Sect. 6.

2 INFINITECH Healthcare Insurance System

The healthcare insurance pilot of INFINITECH focuses on health insurance and risk analysis by developing two AI-powered services:

  1. The risk assessment service allows the insurance company to adapt prices by classifying individuals according to their lifestyle.

  2. The coach service advises individuals on their lifestyle choices, aiming at improving their health but also at persuading them to use the system correctly.

These two services rely on a model of health outlook trained on the collected data and used in the provision of the services.

An overview of the pilot system for healthcare insurance is given in Fig. 16.1. It comprises two systems: the pilot test bed, built within the INFINITECH project, and the Healthentia eClinical platform, provided by Innovation Sprint. The data are collected by Healthentia, as detailed in Sect. 3, where the complete Healthentia eClinical platform is also presented. The pilot test bed facilitates secure and privacy-preserving model training, as discussed in Sect. 4. The trained model is utilized for the risk assessment and the lifestyle coach ML services detailed in Sect. 5, and the results are finally visualized by the dashboards of the Healthentia portal web app.

Fig. 16.1 INFINITECH healthcare insurance system comprising Healthentia and the respective test bed

3 Data Collection

Two types of data are collected to train the health outlook model: measurements and user reports. The measurements are values collected by sensors, which are automatically reported by these sensors to the data collection system, without the intervention of the user. They are objective data, since their quality only depends on the devices’ measurement accuracy. They concern physical activity, the heart, and sleep. The physical activity measurements involve steps, distance, elevation, energy consumption, and time spent in three different zones of activity intensity (light, moderate, and intense). The heart measurements include the resting heart rate and the time spent in different zones of heart activity (fat burn, cardio, and peak). The sleep measurements include the bedtime and the wake-up time (and thus, indirectly, the sleep duration), as well as the time spent in the different sleep stages (light, REM, and deep sleep).

The reports are self-assessments of the individuals; hence, they are subjective data. They cover common symptoms, nutrition, mood, and quality of life. The symptoms are systolic and diastolic blood pressure and body temperature (entered as numbers measured by the users), as well as cough, diarrhea, fatigue, headache, and pain (where the user provides a five-level self-assessment of severity from not at all up to very much). Regarding nutrition, the user enters the number of meals and whether they contain meat, as well as the consumption of liquids: water, coffee, tea, beverages, and spirits. Mood is a five-level self-assessment of the user’s psychological condition, from very positive to neutral and down to very negative. Finally, quality of life [15] is reported on a weekly basis using the Positive Health questionnaire [16] and on a monthly basis using the EuroQol EQ-5D-5L questionnaire [17], which asks the user to assess their status in five dimensions using five levels. The dimensions are mobility, self-care, usual activities, pain/discomfort, and anxiety/depression, complemented with an overall numeric health self-assessment.
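To make the two data types concrete, the following is a minimal sketch of how one day of collected data might be represented. It is not the Healthentia data model: all class and field names, the units, and the integer coding 1 to 5 of the five-level self-assessments are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Assumed coding of every five-level self-assessment: integers 1..5.
FIVE_LEVELS = range(1, 6)


@dataclass
class Measurements:
    """Objective, sensor-reported values for one day (illustrative field names)."""
    steps: int = 0
    distance_km: float = 0.0
    resting_heart_rate: Optional[float] = None
    activity_zone_minutes: Dict[str, float] = field(default_factory=dict)  # light, moderate, intense
    heart_zone_minutes: Dict[str, float] = field(default_factory=dict)     # fat burn, cardio, peak
    sleep_stage_minutes: Dict[str, float] = field(default_factory=dict)    # light, REM, deep

    def sleep_duration_minutes(self) -> float:
        # Assumption: sleep duration is approximated by the sum of the stage times.
        return sum(self.sleep_stage_minutes.values())


@dataclass
class Reports:
    """Subjective, self-assessed values for one day (illustrative field names)."""
    symptom_severity: Dict[str, int] = field(default_factory=dict)  # cough, fatigue, ...
    mood: Optional[int] = None
    meals: Optional[int] = None

    def __post_init__(self) -> None:
        # Reject any five-level answer outside the assumed 1..5 coding.
        levels = dict(self.symptom_severity)
        if self.mood is not None:
            levels["mood"] = self.mood
        for name, level in levels.items():
            if level not in FIVE_LEVELS:
                raise ValueError(f"{name} must be an integer from 1 to 5, got {level!r}")


@dataclass
class DailyRecord:
    """One day of data for one pseudonymous user."""
    user_id: str
    day: str  # ISO date, e.g. "2021-03-01"
    measurements: Measurements
    reports: Reports


record = DailyRecord(
    user_id="user-001",
    day="2021-03-01",
    measurements=Measurements(steps=8200, sleep_stage_minutes={"light": 240, "REM": 90, "deep": 60}),
    reports=Reports(symptom_severity={"fatigue": 2}, mood=4, meals=3),
)
```

Keeping measurements and reports in separate structures mirrors the objective/subjective split described above and makes it easy to validate only the self-assessed fields.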

The data collection is facilitated by Healthentia [10]. The platform provides secure, persistent data storage and role-based, GDPR-compliant access. It collects the data from the mobile applications of all users, facilitating smart services, such as risk assessment, and providing both original and processed information to the mobile and portal applications for visualization. The high-level architecture of the platform is shown in Fig. 16.2. The service layer implements the necessary functionalities of the web portal and the mobile application. The study layer facilitates study management, organizing therein healthcare professionals, participants, and their data. The studies can be formal clinical studies or informal ones managed by pharmaceutical companies, hospitals, research centers, or, in this case, insurance companies.

Fig. 16.2 Healthentia high-level architecture

The Healthentia core layer comprises high-order functionalities on top of the data, like role-based control, participant management, participants’ report management, and ML functionalities. The low-level operations on the data are hosted in the data management layer. Finally, the API layer provides the means to expose all the functionalities of the layers above to the outside world. It also facilitates data exporting toward the pilot test bed and model importing from the test bed.

The Healthentia mobile application (Fig. 16.3) enables data collection. Measurements are obtained from IoT devices, third-party mobile services, or a proprietary sensing service. User reports are obtained via answering questionnaires that either are regularly pushed to the users’ phones or are accessed on demand by the users themselves. Both the measured and reported data are displayed to the users, together with any insights offered by the smart services of the platform.

Fig. 16.3 Healthentia mobile application

The Healthentia portal application (Fig. 16.4) targets the health insurance professionals. It provides an overview of the users of each insurance organization and details for each user. Both overview and details include analytics based on the collected data and the risk assessment insights. It also facilitates managing the organization, providing, for example, a questionnaire management system to determine the types of self-assessments and reports provided by the users.

Fig. 16.4 Healthentia portal application – viewing measurements and creating questionnaires

4 INFINITECH Healthcare Insurance Pilot Test Bed

The INFINITECH healthcare insurance test bed facilitates model training by providing the necessary regulatory compliance tools and the hardware to run the model training scripts, whenever new models are to be trained. Its high-level architecture is shown in the upper part of Fig. 16.1. The test bed ingests data from Healthentia, processes it for model training, and offers the means to perform the model training. It then provides the models back to Healthentia for online usage in risk assessment.

The regulatory compliance tools provide the data in a form that is compliant for model training. The tools comprise the Data Protection Orchestrator (DPO) [18, 19], which, among other things, regulates data ingestion, and the anonymizer. Both are presented in detail in Chap. 20.

The INFINITECH data collection module [20] is responsible for ingesting the data from Healthentia utilizing the Healthentia API when so instructed by the DPO. It then provides data cleanup services, before handing the data to the INFINITECH anonymizer [21, 22]. The ingested data are already pseudo-anonymized, as only non-identifying identifiers are used to designate the individuals providing the data, but the tool performs anonymization of the data itself. The anonymized data are stored in LeanXcale [23], the test bed’s database. Different anonymized versions are to be stored, varying the level of anonymization, in order to determine its effect on the quality of the trained models. The model training is an offline process. Hence, the ML engineers responsible for model training will be instructing the DPO to orchestrate data ingestion at different anonymization levels. Models based on logistic regression [24], random forest [25], and (deep) neural networks [26] are trained to predict the self-reported health outlook variation on a weekly basis. Binary and tristate models have been trained using data collected in the data collection phase of the healthcare insurance pilot, involving 29 individuals over periods of time spanning 3–15 months. The classification accuracy of the random forest models is best in this context, since the dataset is too limited for neural networks of some depth. The classification rates of the random forest models are shown in Fig. 16.5.

Fig. 16.5 Classification rate of random forest binary (improve, worsen) and tristate (improve, same, worsen) models for different number of estimators
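The chapter does not spell out how the weekly prediction target is derived from the self-reports. One plausible reading, sketched below purely as an assumption, is to compare consecutive weekly self-assessed health scores and label each week as improve, same, or worsen. The function name, the `tolerance` parameter, and the choice to mark unchanged weeks as `None` in the binary case are all hypothetical.

```python
from typing import List, Optional, Sequence


def weekly_outlook_labels(
    weekly_scores: Sequence[float],
    tristate: bool = True,
    tolerance: float = 0.0,
) -> List[Optional[str]]:
    """Label the change between consecutive weekly self-assessed health scores.

    Returns one label per pair of consecutive weeks. In the binary case
    (improve, worsen), weeks without a change are marked None so that the
    caller can drop them before training (an assumption of this sketch).
    """
    labels: List[Optional[str]] = []
    for previous, current in zip(weekly_scores, weekly_scores[1:]):
        delta = current - previous
        if delta > tolerance:
            labels.append("improve")
        elif delta < -tolerance:
            labels.append("worsen")
        else:
            labels.append("same" if tristate else None)
    return labels


# Four weekly scores yield three week-to-week labels.
scores = [70, 75, 75, 60]
tristate_labels = weekly_outlook_labels(scores)
binary_labels = weekly_outlook_labels(scores, tristate=False)
```

A nonzero `tolerance` would treat small fluctuations of the self-assessment as "same", which may matter for subjective scores that vary slightly from week to week.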

Shapley additive explanations (SHAP) analysis [27] is employed to establish the impact of the different feature vector elements on the classifier decisions (either positive or negative). Averaged over all decisions, this gives the importance of the different attributes for the task at hand. Attributes of negligible impact on decisions can be removed, and the models can be retrained on a feature space of lower dimensions. Most importantly though, SHAP analysis is used in the virtual coaching service as discussed in Sect. 5.2.
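The averaging step can be sketched as follows, assuming the per-decision SHAP values have already been computed and are available as one row per decision. Using the mean absolute SHAP value as the importance measure is a common convention adopted here as an assumption, as is the relative threshold for calling an attribute negligible. The function names and feature names are illustrative.

```python
from typing import Dict, List, Sequence


def mean_abs_shap(
    shap_rows: Sequence[Sequence[float]], feature_names: Sequence[str]
) -> Dict[str, float]:
    """Average the absolute SHAP value of each feature over all decisions."""
    if not shap_rows:
        raise ValueError("at least one decision is required")
    totals = [0.0] * len(feature_names)
    for row in shap_rows:
        for j, value in enumerate(row):
            totals[j] += abs(value)
    return {name: totals[j] / len(shap_rows) for j, name in enumerate(feature_names)}


def negligible_features(
    importance: Dict[str, float], rel_threshold: float = 0.01
) -> List[str]:
    """List features whose share of the total importance is below rel_threshold."""
    total = sum(importance.values())
    return [name for name, value in importance.items() if value < rel_threshold * total]


# Two decisions over three hypothetical lifestyle attributes.
names = ["steps", "tea", "sleep"]
rows = [[0.2, -0.001, 0.1], [-0.4, 0.001, 0.3]]
importance = mean_abs_shap(rows, names)
to_remove = negligible_features(importance)
```

The attributes returned by `negligible_features` are candidates for removal before retraining on a lower-dimensional feature space.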

These models are not the final ones. During the pilot validation phase that will start in September 2021, 150 individuals will be using the system for 12 months. The data from approximately two-thirds of these participants will be used to keep retraining the models.

5 ML Services

Two services are built using the models trained in Sect. 4. The model decisions accumulated across time per individual are used in risk assessment (see Sect. 5.1), while the SHAP analysis of the individual decisions is used in virtual coaching (see Sect. 5.2).

Since the actual data collected thus far are barely enough for training the models, a synthetic dataset is generated to evaluate the performance of both services. Five behavioral phenotypes are defined in the simulator of [14]. Two are extreme ones: at one end, the athletic phenotype, who likes indoor and outdoor exercise and enjoys a good night’s sleep, and, at the other end, the gamer, who is all about entertainment, mainly indoors, enjoys work, and is not too keen on sleeping on time. In between lies the balanced phenotype, with all behavioral traits being more or less of equal importance, as they are allowed a small variance around the average. Two random variants of the balanced phenotype are also created, with the behavioral traits allowed quite some variance from the balanced state. One random phenotype is associated with an excellent health status, while the other is associated with a typical health status. For each phenotype, 200 individuals are simulated for a duration of 2 years.

5.1 Personalized Risk Assessment

Personalized risk assessment is based on the decisions of the models for each individual. The assessments are long term in the sense that they take into account all the model decisions over time intervals that are very long. In this study, we calculate the long-term averages of the different daily decisions with a memory length corresponding roughly to half a year. There are two such averaged outputs for models with binary decisions and three for those with tristate ones. In every case, the averages are run for the whole length of the synthetic dataset (two years), and for each day of decision, they sum to unity. At any day, the outlook is assessed as the sum of all the averaged positive outcomes from the beginning of the dataset up to the date of the assessment, minus the sum of the negative ones in the same time interval. In the tristate case, the difference is normalized by the sum of the constant ones. The resulting grade is multiplied by 100 and thus can be in the range of [−100,100]. Obviously, outlook grades larger than zero correspond to people whose well-being outlook has been mostly positive in the observation period, and outlook grades smaller than zero correspond to people whose well-being outlook has been mostly negative in the observation period.
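The grade computation described above can be sketched as follows. Two points are assumptions of this sketch rather than statements of the chapter: the long-term average is implemented as an exponentially weighted running average with a time constant of about half a year, and the normalization divides the difference of the positive and negative sums by the sum of all averaged outputs, which is one reading that keeps the grade within [−100, 100]. The function and class names are illustrative.

```python
from typing import Dict, List, Sequence

CLASSES = ("improve", "same", "worsen")


def averaged_outputs(
    decisions: Sequence[str], memory_days: float = 182.0
) -> List[Dict[str, float]]:
    """Running averages of the one-hot daily decisions; they sum to one each day."""
    if memory_days <= 0:
        raise ValueError("memory_days must be positive")
    alpha = 1.0 / memory_days  # weight of the newest decision (assumed averaging scheme)
    averages: List[Dict[str, float]] = []
    current: Dict[str, float] = {}
    for decision in decisions:
        one_hot = {c: 1.0 if c == decision else 0.0 for c in CLASSES}
        if not current:
            current = one_hot  # the first day starts from the first decision
        else:
            current = {c: (1.0 - alpha) * current[c] + alpha * one_hot[c] for c in CLASSES}
        averages.append(current)
    return averages


def outlook_grade(averages: Sequence[Dict[str, float]]) -> float:
    """Accumulated positive minus negative averaged outputs, scaled to [-100, 100]."""
    positive = sum(a["improve"] for a in averages)
    negative = sum(a["worsen"] for a in averages)
    total = sum(sum(a.values()) for a in averages)  # equals the number of days
    if total == 0:
        return 0.0
    return 100.0 * (positive - negative) / total


# A binary model simply never emits "same"; the same code covers both cases.
daily_decisions = ["improve", "same", "worsen", "improve"] * 50
grade = outlook_grade(averaged_outputs(daily_decisions))
```

With this reading, a person whose decisions are always "improve" reaches a grade of 100, one whose decisions are always "worsen" reaches −100, and long stretches of "same" pull the grade toward zero.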

The daily evolutions of the accumulated model decisions for the 2 years of observation for the first athletic, balanced, and gamer simulated persons are shown in Fig. 16.6. Clearly, the athletic person is doing great, and the balanced one is doing quite well. The gamer is not worsening but looks rather stagnant. The histograms of the outlook grades after 2 years are shown in Fig. 16.7 for each of the five behavioral phenotypes. It should be no surprise that the two extreme phenotypes are at opposite sides of the outlook spectrum, clearly separated by a range of values occupied by the balanced and random phenotypes. It is the actual activities done that determine the outlook grade, and in the balanced and random phenotypes, the selection of activities is quite different within each phenotype, so they exhibit quite a lot of spread in the outlook, as expected in real life.

Fig. 16.6 Averaged outputs of the tristate random forest classifier with 2048 trees for the first athletic person (top) with an outlook grade after 2 years of 46.2, for the first balanced person (middle) with an outlook grade of 15.2, and for the first gamer (bottom) with an outlook grade of 2.9

Fig. 16.7 Histograms of outlook grades for the five behavioral types. From athletic to gamer, the bulk of the grades move toward smaller values

5.2 Personalized Coaching

The SHAP analysis results for the individual decisions are shown in Fig. 16.8. Each row corresponds to a lifestyle attribute, and each dot in a specific row corresponds to the value of that element in one of the input daily vectors. The color of the dot indicates the element’s value (from small values in blue to large values in red). The placement of the dot on the horizontal axis corresponds to the SHAP value. Values close to zero correspond to lifestyle attributes with negligible effect on the decision, while large positive or negative values correspond to lifestyle attributes with large effects. The vertical displacement indicates how many feature vectors fall into the particular range of SHAP values. Thus, thick dot cloud areas correspond to many input daily vectors in that range of SHAP values. Dots on the left correspond to attribute values that direct one toward a prediction that health is improving, while dots on the right suggest a worsening of health. For example, red dots of large values of the body mass index trend (increasing weight) are on the right indicating negative health outlook. Purple dots of moderate body mass index trend are around zero, indicating negligible effect on the decisions. Finally, blue dots indicating a trend to lose weight are on the left, indicating improved health outlook.

Fig. 16.8 SHAP analysis of the individual decisions, signaling the importance of small, medium, and large feature values in the final decision reached for this vector

The individual SHAP coefficients per model decision are employed to establish the per-person importance of lifestyle attributes in a positive or negative well-being outlook. The most influential lifestyle attributes for a positive outlook are collected over any short time interval as those with the largest positive SHAP coefficients. Similarly, the most influential negative attributes are obtained for the same interval. Then, the individual is coached about these positive and negative attributes. The personalized coach offers advice toward behaviors of the positive attributes and away from those of the negative attributes.
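The selection of coaching attributes can be sketched as follows, assuming one row of SHAP coefficients per daily decision of a single person. The sketch further assumes that the coefficients are oriented so that positive values push toward a positive outlook, that they are aggregated over the interval by a plain mean, and that a fixed number `top_k` of attributes is reported in each direction. None of these choices is stated in the chapter, and the names are illustrative.

```python
from typing import List, Sequence, Tuple


def coaching_attributes(
    shap_rows: Sequence[Sequence[float]],
    feature_names: Sequence[str],
    top_k: int = 3,
) -> Tuple[List[str], List[str]]:
    """Return the most influential positive and negative attributes of an interval."""
    if not shap_rows:
        raise ValueError("the interval must contain at least one decision")
    count = len(shap_rows)
    # Mean SHAP coefficient of each attribute over the interval (assumed aggregation).
    means = [
        (name, sum(row[j] for row in shap_rows) / count)
        for j, name in enumerate(feature_names)
    ]
    ranked = sorted(means, key=lambda item: item[1])
    positive = [name for name, value in reversed(ranked) if value > 0][:top_k]
    negative = [name for name, value in ranked if value < 0][:top_k]
    return positive, negative


# Two daily decisions of one person over four hypothetical attributes.
names = ["steps", "sleep", "bmi_trend", "coffee"]
interval = [[0.3, 0.1, -0.2, 0.0], [0.1, 0.2, -0.4, 0.0]]
encourage, discourage = coaching_attributes(interval, names, top_k=2)
```

The first list holds the attributes the coach would encourage, and the second those it would advise against. Attributes with zero mean contribution appear in neither list.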

It is worth mentioning that the explainable AI technique for personalized coaching discussed here is only about the content of the advice. An actual virtual coach should also involve decisions on the timing, the modality, and the tone of the messages carrying the advice to the individuals.

6 Conclusions

The INFINITECH way of delivering personalized services for health insurance is discussed in this chapter. To that end, the healthcare insurance pilot of the project is integrating Healthentia, an eClinical platform for collecting real-world data from individuals, into a test bed for training classification models. The resulting models are used in providing risk assessment and personalized coaching services. The predictive capabilities of the models are acceptable, and their use in the services to analyze the simulated behavior of individuals is promising. The actual validation of the provided services is to be carried out in a pilot study with 150 individuals, which started in September 2021 and will last 1 year.

Our future work on this usage-based healthcare insurance pilot will not be confined to validating the technical implementation of the pilot system, including its data collection and machine learning-based analytics parts. We will also explore how such usage-based systems can enable new business models and healthcare insurance service offerings. For instance, we will study possible pricing models and related incentives that could make usage-based insurance more attractive than conventional insurance products to the majority of consumers.