1 Introduction

Connected objects are changing the world and the way organizations do business. Cutting-edge technologies in several disciplines, such as IoT, 5G, and AI, are paving the way to new business services. Thanks to these disparate technologies, advanced services that were previously unimaginable can be provided, bringing benefits to both companies and their customers. The present chapter introduces the INFINITECH Pilot 11, which exploits these concepts and technologies from the point of view of car insurance companies and their insured clients, the drivers. Although the idea of classifying a driver by reading technical data provided by their own vehicle is not really new, this pilot goes a couple of steps further: first, by developing an IoT standards-based platform to homogenize and manage real-time data captured from vehicles while, at the same time, merging it with other relevant sources, like weather or traffic incidents; and, next, by using this aggregated information to evaluate cutting-edge ML/DL technologies and develop an accurate AI-powered model to infer different driving profiles. The final objective is to use this driving profiling model to analyze the insured’s driving behavior and tailor the services offered to them, while helping insurance companies to better estimate the associated risks in real time.

In this line, this chapter presents a connected car as an IoT infrastructure in itself and provides an overview of the usage-based automotive insurance solutions being developed within the INFINITECH project. The chapter is divided into six sections beyond this introduction. Section 2 “Insurance Premiums for Insured Clients” shows the current status of the automotive insurance sector and gives a general review of the traditional methods used to calculate automobile insurance premiums. Section 3 “Connected Vehicles’ Infrastructures” provides an overview of the technologies that current vehicles support in order to get connected and of the upcoming scenarios that exploit these connected vehicles (V2X infrastructures, autonomous vehicles, etc.). Section 4 “Data Gathering and Homogenization Process” describes the INFINITECH technologies used to collect the required data sources and the process undertaken to homogenize and standardize the data using the FIWARE NGSI standard for Context Information Management [1]. Section 5 “Driving Profile Model” explains the steps to develop the AI model that assists in route clustering. Section 6 “Customized Car Insurance Services Powered by ML” introduces the services that exploit this newly created AI model and the new way to estimate risks. Finally, Section 7, the concluding section of the chapter, gives a summary and critique of the findings and presents the challenges found and envisioned.

2 Insurance Premiums for Insured Clients

Motor Third Party Liability (MTPL) insurance ensures that damage to third-party health and property caused by an accident for which the driver and/or the owner of the vehicle was responsible is covered. It was understood from the very beginning of motor vehicle circulation that this kind of insurance had to be compulsory, because the party who caused bodily injury or property damage in a car accident often lacked the solvency to compensate for it. For instance, in the United Kingdom, MTPL insurance has been compulsory since the Road Traffic Act of 1930. Nowadays, MTPL insurance is compulsory in all countries of the European Union and in most countries worldwide, with the limits of the cover varying from region to region. Apart from the compulsory MTPL insurance, own damage (OD) is an additional cover in a car insurance policy. OD covers damage caused to the insured’s own vehicle by events such as accidents, fire, or theft. In case of an accident, an own damage cover compensates the insured for the expense of repairing or replacing the parts of the car damaged in the accident.

In order to be in a position to pay claims and make a profit, insurance companies need to assess the risk involved in every insurance policy. Insurance actuaries help insurance companies with this risk assessment, which is then used to design and price insurance policies. Risk assessment involves measuring the probability that something will happen to cause a loss. The higher the risk for a certain group, the more likely it is that insurance companies will have to pay out a claim, meaning these groups will be charged more for their motor insurance. In a few regions of the world, compulsory MTPL insurance premiums are determined by the government; in most of the world, insurance premiums are determined by insurance companies, depending on many factors that are believed to affect the expected cost of future claims. These factors are:

  • The driving record of the driver or the vehicle: the better the record, the lower the premium. An insured who has had accidents or serious traffic violations is likely to pay more; likewise, a new driver with no driving record pays more.

  • The driver, especially the gender (male drivers, especially younger ones, are on average regarded as driving more aggressively), the age (teenage drivers are involved in more accidents because they lack driving experience, so they are charged more, while experienced senior drivers are charged less; however, after a certain age, premiums rise again due to slower reflexes and reaction times), the marital status (married drivers average fewer accidents than single drivers), and the profession (certain professions are proven to result in more accidents, especially if travelling by car is involved).

  • The vehicle, focusing on specific aspects of it such as the cost of the car, likelihood of theft, the cost of repairs, its engine size, and the overall safety record of the vehicle. Vehicles with high-quality safety equipment may qualify for premium discounts. Insurance companies also take into consideration the damage a vehicle can inflict on other vehicles.

  • The location, meaning the address of the owner or the usual place of circulation; areas with high crime rates (vandalism, thefts) generally lead to higher premiums as well as areas with higher rates of accidents. So, urban drivers pay more for an insurance policy than those in smaller towns or rural areas. Other area-related pricing factors are cost and frequency of litigation, medical care and car repair costs, prevalence of auto insurance fraud, and weather trends.

  • And finally, the type and amount of motor insurance, the limits on basic motor insurance, the amount of deductibles if any, and the types and amounts of policy options.

This is the traditional way of pricing based on population-level statistics available to the insurance companies prior to the initial insurance policy. The contemporary approach in motor insurance is to consider the present patterns of driving behavior through usage-based insurance (UBI) schemes. There are three types of UBI:

  • The simplest form of UBI bases the insurance premiums on the distance driven, simply using the odometer reading of the vehicle. In 1986, Cents Per Mile Now introduced classified odometer-mile rates into the insurance market. The insureds buy prepaid miles of insurance protection according to their needs. Insurance automatically ends when the odometer limit is reached, so the insureds must keep track of miles on their own to know when to buy more. In the event of a traffic stop or an accident, the officer can easily verify that the insurance is valid by comparing the figure on the insurance card with the odometer. Critics of this UBI scheme point out the potential for odometer tampering to cheat the system, but newer electronic odometers are difficult to tamper with, and doing so is expensive because of the equipment needed; apart from being risky, it is therefore uneconomical in terms of motor insurance.

  • Another form of UBI is based on the distance driven, obtained by aggregating miles (or minutes of use) from installed GPS-based devices that report the results via cellphones or RF technology. In 1998, Progressive Insurance started a pilot program in Texas using this approach; although the pilot was discontinued in 2000, many other products followed this approach, referred to as telematic insurance or black box insurance, a technology also used for stolen vehicle and fleet tracking. Since 2010, GPS-based and telematic systems have become mainstream in motor insurance, and two new forms of UBI have appeared: Pay As You Drive (PAYD) and Pay How You Drive (PHYD) insurance policies. In PAYD policies, insurance premiums are calculated from the distance driven, whereas in PHYD, the calculation is similar to PAYD but also brings in additional sensors, such as accelerometers, to monitor driving behavior. Since 2012, smartphone auto insurance policies have offered another type of GPS-based system that uses the smartphone as the GPS sensor. Although this system is less reliable, it is used because of its availability: it only requires a smartphone, which most insureds already have, and no other special equipment.

  • The last type of UBI is based on data collected directly from the vehicle with a device connected to the vehicle’s On-Board Diagnostic (OBD-II) port, transmitting speed, time of the day, and the number of miles the vehicle is driven. Vehicles that are driven less often, in less risky ways, and at less risky times of the day can receive discounts, and vice versa. This means drivers have a stronger incentive to adopt safer practices, which may result in fewer accidents and safer roads for everyone. In more recent approaches, OBD-II-based systems may record braking force, speed, and proximity to other cars, and may help to identify drunk drivers or drivers using cellphones.
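The difference between PAYD and PHYD pricing can be illustrated with a minimal sketch. All rates, surcharges, and field names below (`harsh_brakes`, `night_km`, `brake_surcharge`, `night_multiplier`) are invented for illustration only and do not correspond to any real insurance product or to the pilot's pricing.

```python
from dataclasses import dataclass


@dataclass
class TripSummary:
    distance_km: float
    harsh_brakes: int   # hypothetical behaviour indicator
    night_km: float     # hypothetical: kilometres driven at night


def payd_premium(trips, base_fee=10.0, rate_per_km=0.03):
    """Pay As You Drive: a fixed fee plus a charge per kilometre driven."""
    return base_fee + rate_per_km * sum(t.distance_km for t in trips)


def phyd_premium(trips, base_fee=10.0, rate_per_km=0.03,
                 brake_surcharge=0.50, night_multiplier=1.5):
    """Pay How You Drive: the distance charge is adjusted by behaviour."""
    total = base_fee
    for t in trips:
        day_km = t.distance_km - t.night_km
        # Night kilometres are weighted more heavily than day kilometres.
        total += rate_per_km * (day_km + night_multiplier * t.night_km)
        # Each harsh-braking event adds a flat surcharge.
        total += brake_surcharge * t.harsh_brakes
    return total
```

With the same trips, the PAYD figure depends only on the total distance, whereas the PHYD figure also grows with night driving and harsh-braking events.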

3 Connected Vehicles’ Infrastructures

In recent years, the functionalities and services around connected and autonomous vehicles have grown enormously by exploiting the technical datasets these vehicles can provide. Some examples of these new services are:

  • Infotainment services such as HD video transmission, personal assistant, etc.

  • HD maps for autonomous navigation

  • Cooperative Intelligent Transport Systems (C-ITS) services [2] such as hazard warning for the drivers

  • Autonomous maneuvers such as overtaking, motorway lane merge, etc.

  • Algorithms such as pedestrian detection, parking slot mark detection, etc.

To implement these new features, these vehicles (and infrastructures) demand more and more bandwidth, depending on how critical latency is and/or on the potentially large amount of data to be transmitted. This fosters the development of new communication protocols and connection paradigms to support them, while new potential services grow around them.

In order to gather information from the vehicles, their own protocols and systems have evolved based on the needs and the capabilities of their embedded devices and central processing units. These have been enhanced by IoT technologies and telecommunication systems that allow nowadays communication between different components and actors not only within the vehicle but also with other vehicles, infrastructure, and clouds.

The new connected cars can be considered a complete IoT infrastructure that feeds the Pilot 11 services. This infrastructure relies on the following standards:

  • OBD (on-board diagnostics) refers to the standard for exposing different vehicle parameters that can be accessed and checked in order to detect and prevent malfunctions. The way these parameters are accessed through the OBD connector, and their variety, have evolved since the 1980s, when this technology was introduced. OBD-I interfaces came first and were intended to encourage manufacturers to control emissions in a more efficient way. OBD-II provides more standardized information and system check protocols, creating enriched failure logs and supporting wireless access via Wi-Fi or Bluetooth. Nowadays, the standards used for monitoring and controlling the engine and other devices of the vehicles are OBD-II (USA), EOBD (Europe), and JOBD (Japan).

  • CAN (Controller Area Network), introduced in the 1980s, is a bus protocol that broadcasts relevant parameter status and information to all the connected nodes within the vehicle. With a data rate of up to 1 Mbps, it supports message priority, latency guarantees, multicast, error detection and signaling, and automatic retransmission of corrupted frames. CAN is normally used in soft real-time systems, such as engines, power trains, chassis, battery management systems, etc. CAN FD (flexible data rate) addresses the need to increase the data transmission rate (up to 5 Mbps) and to support larger frames in order to take advantage of all the capabilities of the latest automotive ECUs (electronic control units).

  • MOST (Media Oriented Systems Transport) is another bus-based data transfer protocol, oriented to the interconnection of multimedia components in vehicles. It was created in 1997, and its main difference from other automotive bus standards is that it is based on optical fiber, which allows higher data rates (up to 150 Mbps). MOST is normally used in media-related applications and control in automotive.

  • FlexRay is considered more advanced than CAN regarding costs and features. Its most significant features are a high data rate (10 Mbps), redundancy, security, and fault tolerance. While CAN nodes must be configured for a specific baud rate, FlexRay allows deterministic data, which arrives within a predictable time frame, to be combined with dynamic (event-driven) data. FlexRay is normally used in hard real-time systems, such as powertrain or chassis.

  • Automotive Ethernet provides a different option. It allows multipoint connections and is oriented to providing secure data transfer while handling large amounts of data. The most relevant advantages for the automotive sector are higher bandwidth and low latency. Traditional Ethernet is too noisy and very sensitive to interference; Automotive Ethernet was developed to overcome these two issues.
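As an illustration of how standardized OBD-II parameters are turned into physical values, the following sketch decodes three common Mode 01 parameter IDs (engine speed, vehicle speed, and coolant temperature) from a raw response. It is a generic example of the OBD-II encoding, not the pilot's acquisition code (the pilot reads vehicle data through the CAN channel of the on-board unit); the function name and the returned tuple layout are assumptions made for this example.

```python
def decode_obd_response(payload: bytes):
    """Decode a Mode 01 OBD-II response into (name, value, unit).

    A positive Mode 01 response starts with 0x41, followed by the PID
    and the data bytes (A, B, ...).
    """
    if len(payload) < 3 or payload[0] != 0x41:
        raise ValueError("not a Mode 01 response")
    pid, data = payload[1], payload[2:]
    if pid == 0x0C:  # engine speed: (256 * A + B) / 4 rpm
        if len(data) < 2:
            raise ValueError("engine speed needs two data bytes")
        return ("engine_rpm", (256 * data[0] + data[1]) / 4, "rpm")
    if pid == 0x0D:  # vehicle speed: A km/h
        return ("vehicle_speed", data[0], "km/h")
    if pid == 0x05:  # engine coolant temperature: A - 40 degrees Celsius
        return ("coolant_temp", data[0] - 40, "degC")
    raise ValueError(f"PID 0x{pid:02X} is not handled in this sketch")
```

For example, the response bytes `41 0C 1A F8` decode to an engine speed of 1726 rpm, and `41 0D 3C` to a vehicle speed of 60 km/h.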

3.1 Vehicle to Everything: V2X Communication Technologies

Taking into account the evolution of autonomous and connected vehicle functionalities, a large part of the data accessed and gathered using the technologies presented above can be uploaded to an upper infrastructure (Cloud, Road Side Units, etc.) or shared with other vehicles with the aim of improving road safety and traffic efficiency. In turn, information from cloud infrastructures and from other vehicles can be received and exploited by a connected car. These scenarios are known as vehicle-to-infrastructure (V2I) and vehicle-to-vehicle (V2V) connections, both grouped under the V2X acronym, and they enable the data gathering process required to feed this pilot.

The V2X communication technologies can be divided into short- and long-range communications. Based on the requirements of the use case and the scenario deployed, one or another would be selected.

3.1.1 Dedicated Short-Range Communications: 802.11p

802.11p, also known in Europe as ITS-G5 [3], is the standard that supports direct communication for V2V and V2I. This specification defines the architecture, the protocol at network and transport level, the security layer, and the frequency allocation, as well as the message format, size, attributes, and headers. The messages defined by the standard are Decentralized Environmental Notification Messages (DENM), Cooperative Awareness Messages (CAM), Basic Safety Message (BSM), Signal Phase and Timing Message (SPAT), In Vehicle Information Message (IVI), and Service Request Message (SRM).

The ETSI standard also defines the quality of service regarding the transmission and reception requirements. As a result, the protocol is very robust and can transmit very low volumes of data with very low latency (on the order of milliseconds), which makes it very appropriate for autonomous vehicle use cases or road hazard warnings.

On the other hand, ITS-G5 is defined for short-range communications, directly between vehicles or between vehicles and nearby infrastructure such as RSUs (Road Side Units), each of which can cover an area with a radius of 400 m in interurban environments and 150 m in urban areas. Consequently, the larger the area to be covered, the more infrastructure is required.
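The infrastructure cost implied by these ranges can be estimated with a back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that RSUs are placed along a straight road with touching, non-overlapping coverage, so that each unit covers a stretch of twice its radius; real deployments need overlap and depend on terrain and obstacles.

```python
import math


def rsus_needed(road_length_m: float, radius_m: float) -> int:
    """Minimum number of RSUs to cover a straight road (idealized layout)."""
    if road_length_m <= 0 or radius_m <= 0:
        raise ValueError("length and radius must be positive")
    # Each RSU covers a segment of 2 * radius along the road.
    return math.ceil(road_length_m / (2 * radius_m))
```

Under these assumptions, a 10 km interurban road (400 m radius) needs 13 RSUs, whereas the same length in an urban area (150 m radius) needs 34.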

3.1.2 Long Range: Cellular Vehicle to Everything (C-V2X)

In recent years, cellular networks for V2X communications have gained importance as an alternative or complement to 802.11p/Wi-Fi. They are supported by many organizations such as the 5GAA (5G Automotive Association), which has dozens of global members, including major carmakers. This radio technology also operates in the 5.9 GHz ITS spectrum and promises very low transmission latency, reaching 4 milliseconds or even less depending on the scenario and the network capabilities, as well as overcoming (when possible) known issues or barriers that 802.11p faces.

The latest C-V2X release is designed to support and take advantage of 5G network capabilities regarding speed and data transmission rate.

With regard to the semantic content of the messages, the standard does not include a specification, but it is proposed to use the ITS-G5 one (SPAT, CAM, DENM, etc.) to be sent over the C-V2X data transport layer.

3.1.3 CTAG’s Hybrid Modular Communication Unit

In order to communicate with the infrastructure and to publish the data gathered from the vehicle to the Cloud and the Road Side Units, as well as to other vehicles, the cars involved in this pilot need to install an On-Board Unit (OBU). This device is designed, manufactured, and installed by CTAG and is known as the Hybrid Modular Communication Unit (HMCU) (Fig. 17.1). This unit implements the link between the vehicle’s internal network and the external cloud infrastructures by supporting the main technologies and interfaces mentioned above: cellular channel (3G/4G/LTE/LTE-V2X/5G), 802.11p (G5) channel, CAN channel, Automotive Ethernet, Wi-Fi (802.11n), and Ethernet 100/1000. Its modular design allows different configurations for cellular, 802.11p, and D-GPS, according to specific needs.

Fig. 17.1 CTAG’s HMCU that captures vehicle’s data and connects to the remote infrastructure

In this way, the HMCU allows the vehicle’s technical data compilation and its update on the corresponding remote infrastructure in real time, supporting novel infotainment and telematic services development, plus advanced driving (AD) and IoT services deployment for the different communication technologies.

For this specific pilot, the datasets from inside the vehicle are gathered by using the CAN channel and published to the CTAG Cloud by using the 4G/LTE channel.

To carry out the proposed scenario and evaluate the components involved, CTAG has provided 20 vehicles equipped with its HMCU, which will report data for the duration of the pilot while driving freely, mostly around CTAG’s location and its surroundings in Spain.

3.1.4 Context Information

Complementing the typical car data source, there are additional information sets which may have a relevant impact on the driving behavior. These available sources are mapped as context information, and the way the driver reacts within this context helps to define their corresponding driving profile. In this sense, the inclusion of these information sources in the driving profiling analysis and the AI models, plus its corresponding impact on the outcomes, represents one of the novelties included in this modeling. INFINITECH developed adaptors to capture this context information related to the driving area monitored in this pilot:

  • Weather information, including temperature, humidity, visibility, precipitation, wind speed, etc., captured from the Spanish State Meteorological Agency (AEMET) [4] and referenced to the area of Vigo [5].

  • Traffic alerts and events, which report traffic accidents, weather issues (fog, ice, heavy rain, etc.), or roadworks, providing location and duration. These are captured from Spain’s Directorate General of Traffic (DGT) [6] and cover the whole Spanish region of Galicia [7].

  • From the OpenStreetMap’s [8] databases, the pilot imports information from roads, such as their location, shape, limits, lanes, maximum allowed speeds, etc. The IDs of these roads will be key to relate and correlate data from vehicles, traffic alerts, weather, and roads themselves.
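One way to relate these context sources to vehicle data is to attach, to each vehicle sample, the road ID it was matched to and the weather observation closest in time. The sketch below is a simplified illustration of that idea; the record layout, the field names (`timestamp`, `road_id`, `weather`), and the one-hour tolerance are assumptions, and the map matching that yields `road_id` is taken as given.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def nearest_observation(observations, when, max_gap=timedelta(hours=1)):
    """Return the record closest in time to `when`, or None if too far.

    `observations` is a list of (timestamp, record) sorted by timestamp.
    """
    times = [t for t, _ in observations]
    i = bisect_left(times, when)
    # Only the neighbours just before and just after `when` can be nearest.
    candidates = observations[max(i - 1, 0): i + 1]
    if not candidates:
        return None
    best = min(candidates, key=lambda obs: abs(obs[0] - when))
    return best[1] if abs(best[0] - when) <= max_gap else None


def enrich_sample(sample, weather, road_id):
    """Attach a road ID and the nearest weather record to a vehicle sample."""
    enriched = dict(sample)
    enriched["road_id"] = road_id
    enriched["weather"] = nearest_observation(weather, sample["timestamp"])
    return enriched
```

A sample with no weather observation within the tolerance keeps `weather` set to `None`, so that downstream steps can decide whether to discard or impute it.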

4 Data Gathering and Homogenization Process

To manage the required information sources and support the development of the pilot’s driving profiling AI model, INFINITECH implements an open software infrastructure, known as the smart fleet framework, that integrates all these heterogeneous assets into standard and homogeneous datasets, so they can be served to the AI framework in a coherent and interrelated fashion.

The gathering and homogenization process is represented in Fig. 17.2 and is based on the FIWARE NGSI standard for Context Information Management [9], fostered by the European Commission and supported by the ETSI. This FIWARE approach provides both a common API to update and distribute information and common data models to map and reference each data source involved. Accordingly, we can consider two sets of components within the smart fleet framework that lead the data collection and standardization:

  • The data adaptors, also identified within the FIWARE ecosystem as IoT agents [10]. These are flexible components, specifically configured or developed for each data source, considering its original communication protocol (LoRa, MQTT, etc.) and data format (text, JSON, UltraLight, etc.). They implement the NGSI adaptation layer, and their flexibility aims at integrating new data providers, such as vehicle manufacturers or connected car infrastructures.

  • The context management and storage enablers, with the Context Broker as their core. This layer covers the data management, data protection, and data processing functionalities from the INFINITECH RA, linking with INFINITECH enablers. It stores and distributes the collected information, once standardized, supporting, among other options, the NGSI-LD protocol for context data and time series retrieval. In addition, it implements SQL and PostgreSQL native protocols so that external layers can move large numbers of records with elaborate database requests.
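To give an idea of what a data adaptor produces, the following sketch maps a raw vehicle reading to an NGSI-LD entity of type Vehicle. It is a minimal illustration rather than the pilot's adaptor: the function name, the choice of attributes (`speed`, `location`), and the identifier pattern are assumptions, and the sketch only builds the JSON document without sending it to a Context Broker.

```python
import json

# Core JSON-LD context published by ETSI for NGSI-LD.
NGSI_LD_CORE_CONTEXT = "https://uri.etsi.org/ngsi-ld/v1/ngsi-ld-core-context.jsonld"


def to_ngsi_ld_vehicle(vehicle_id, speed_kmh, lon, lat, observed_at):
    """Build an NGSI-LD Vehicle entity from a raw reading (illustrative)."""
    return {
        "id": f"urn:ngsi-ld:Vehicle:{vehicle_id}",
        "type": "Vehicle",
        "speed": {
            "type": "Property",
            "value": speed_kmh,
            "unitCode": "KMH",          # UN/CEFACT code for km/h
            "observedAt": observed_at,  # ISO 8601 timestamp in UTC
        },
        "location": {
            "type": "GeoProperty",
            # GeoJSON points are expressed as [longitude, latitude].
            "value": {"type": "Point", "coordinates": [lon, lat]},
        },
        "@context": [NGSI_LD_CORE_CONTEXT],
    }
```

Calling `json.dumps(to_ngsi_ld_vehicle("car-01", 57.0, -8.72, 42.24, "2021-05-01T10:00:00Z"))` yields a document that could be posted to an NGSI-LD compliant broker; in practice, the attribute set would follow the Vehicle data model used by the pilot.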

Fig. 17.2 Pilot 11 data sources and data gathering process

Within this smart fleet framework, collected information is homogenized and normalized according to predefined FIWARE data models, based on its NGSI protocol. This way, smart fleet intends to be aligned with the current Connecting Europe Facility (CEF) [11] program, fostering IoT standardization and integration. The models currently used and enhanced in collaboration with the smart data model [12] initiative are:

  • Vehicle [13] model to map all (real and simulated) data captured from connected cars.

  • WeatherObserved [14] to capture data from AEMET’s weather stations (area of Vigo’s city).

  • Alert [15] that represents an alert or event generated in a specific location. It is used to get data from DGT’s reported traffic events (area of Vigo’s city).

  • Road [16] and RoadSegment [17] that map information from roads and lanes (captured from OpenStreetMap [18]) where both simulated and real connected cars are driving.

Linked to the context information and storage components, and based on Grafana [19] tools, the smart fleet framework implements a set of customized dashboards designed to show the data gathering process. An example of these dashboards is shown in Fig. 17.3, representing the last datasets reported by the connected cars, their location, the corresponding speed for last reporting vehicles, and the table with a subset of the captured raw data.

Fig. 17.3 Smart fleet dashboard showing latest reported data from vehicles

5 Driving Profile Model

Within the context of the INFINITECH project, the objective of Pilot 11 is to improve vehicle insurance tariffication by using AI technologies, allowing customers to pay according to their driving behavior and routines. To support this, the concept of driving profiles will be used and developed. By gathering the data captured on each trip from a vehicle (speed, braking force, indicators’ use, etc.), as well as the context information (weather data, traffic alerts, time of the day, road information, etc.), it is possible to group drivers into several clusters, which form the basis for the profiles. Each of these profiles can later be analyzed and checked for consistency among different routes, providing critical information about the actual use of each insured vehicle.

Figure 17.4 depicts the dataflow followed throughout the pilot’s AI framework to achieve the proper clustering of the different driving profiles. This section explains the different parts of the full algorithm and gives some insight into how some of the phases could be improved.

Fig. 17.4 Data clustering process

First, datasets are served by the smart fleet framework. This information is aggregated and injected directly into the ATOS-powered EASIER.AI platform [20], the pilot’s AI framework proper, by using a specific connector developed on top of the smart fleet APIs. This connector greatly simplifies the data gathering process, since vehicles’ data and context information are retrieved from the smart fleet through its APIs in a seamless and simple way. The data is encapsulated into JSON format files to be retrieved later by the service.

Once the data has been properly loaded, it undergoes several data engineering stages. The first action is to ensure that all the data is complete and does not contain wrong or empty values. Then, an analysis phase is entered, where the most valuable features for the matter at hand are selected and extracted. Currently, the features used are:

  • Average trip velocity

  • Trip distance

  • Trip duration

  • Mean CO2 emissions

  • Mean fuel consumption

  • Mean acceleration force
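The features listed above can be derived from the time-ordered samples of a single trip. The sketch below shows one possible way of doing so; the sample field names (`t_s`, `speed_kmh`, `co2_g_km`, `fuel_l_100km`, `accel_ms2`) and the use of trapezoidal integration for the distance are assumptions for illustration, not the pilot's actual schema or method.

```python
from statistics import mean


def trip_features(samples):
    """Summarise one trip from time-ordered samples (hypothetical schema)."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    duration_s = samples[-1]["t_s"] - samples[0]["t_s"]
    if duration_s <= 0:
        raise ValueError("samples must span a positive time interval")
    # Trapezoidal integration of speed over time gives the distance.
    distance_km = 0.0
    for a, b in zip(samples, samples[1:]):
        hours = (b["t_s"] - a["t_s"]) / 3600.0
        distance_km += 0.5 * (a["speed_kmh"] + b["speed_kmh"]) * hours
    return {
        "avg_velocity_kmh": distance_km / (duration_s / 3600.0),
        "distance_km": distance_km,
        "duration_s": duration_s,
        "mean_co2_g_km": mean(s["co2_g_km"] for s in samples),
        "mean_fuel_l_100km": mean(s["fuel_l_100km"] for s in samples),
        # Magnitude of acceleration, so braking and speeding up both count.
        "mean_accel_ms2": mean(abs(s["accel_ms2"]) for s in samples),
    }
```

Each trip is thereby reduced to a fixed-length vector of six values, which is the shape expected by the later normalization, encoding, and clustering steps.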

As the last step of the data engineering phase, the data is normalized to ensure that all features share the same value range. This proves especially useful when dealing with features as different as distances and velocities, whose differing value ranges could otherwise bias the model.
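The chapter does not state which normalization is applied; as a hedged example, the sketch below uses min-max scaling, which maps every feature column to the range [0, 1]. The function name and the handling of constant columns are choices made for this illustration.

```python
def min_max_normalize(rows):
    """Scale each column of `rows` (equal-length feature vectors) to [0, 1]."""
    if not rows:
        return []
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        # A constant column carries no information and is mapped to 0.0.
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]
```

For instance, trips described by (distance in km, mean speed in km/h) such as `[[10, 100], [20, 300], [30, 200]]` become `[[0.0, 0.0], [0.5, 1.0], [1.0, 0.5]]`, so that neither feature dominates the distance computations used in clustering.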

Once the data is fit for input into the model, it is passed through an encoder block in order to reduce its dimensionality. This improves the performance of the clustering algorithm that is trained as the final step. To achieve this dimensionality reduction, an AI autoencoder component, exploring linear autoencoders [21], will be trained. This type of unsupervised artificial neural network is illustrated in Fig. 17.5. It takes the same data as input and as target output, resulting in a model that recreates the injected values. This by itself is not very useful in this context, but by taking only its first layers, we obtain an encoder block that outputs the data with a different (lower) number of dimensions.
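The principle can be sketched with a tiny linear autoencoder written in plain Python: one weight matrix encodes each feature vector into a shorter code, a second one decodes it, and both are trained so that the reconstruction matches the input. This is only a didactic sketch under simplifying assumptions (no biases, no activation functions, plain stochastic gradient descent, illustrative hyperparameters); the pilot's model is built on the EASIER.AI platform and is not reproduced here.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def train_linear_autoencoder(data, code_dim, epochs=300, lr=0.05, seed=0):
    """Train encoder and decoder matrices with SGD on the squared error."""
    rng = random.Random(seed)
    n = len(data[0])
    enc = [[rng.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(code_dim)]
    dec = [[rng.uniform(-0.5, 0.5) for _ in range(code_dim)] for _ in range(n)]
    for _ in range(epochs):
        for x in data:
            code = matvec(enc, x)
            err = [r - xi for r, xi in zip(matvec(dec, code), x)]
            # Error propagated back to the code layer (dec^T * err),
            # computed before the decoder weights are updated.
            back = [sum(dec[i][j] * err[i] for i in range(n))
                    for j in range(code_dim)]
            for i in range(n):
                for j in range(code_dim):
                    dec[i][j] -= lr * err[i] * code[j]
            for j in range(code_dim):
                for i in range(n):
                    enc[j][i] -= lr * back[j] * x[i]
    return enc, dec


def encode(enc, x):
    """Keep only the encoder half: map x to its lower-dimensional code."""
    return matvec(enc, x)


def reconstruction_mse(data, enc, dec):
    """Mean squared reconstruction error over the dataset."""
    total = 0.0
    for x in data:
        rebuilt = matvec(dec, matvec(enc, x))
        total += sum((r - xi) ** 2 for r, xi in zip(rebuilt, x))
    return total / len(data)
```

After training, only `encode` is needed: it plays the role of the encoder block whose output is passed on to the clustering step.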

Fig. 17.5 Schematic structure of an autoencoder with three fully connected hidden layers [22]

Once the autoencoder is trained, the encoding layers are taken, and all the data are passed through them. These data, now encoded and with fewer features, are sent to our K-means clustering algorithm [23], which separates each driver’s information into different clusters according to the parameters of each route. Once the model and algorithm have been trained, the different clusters can be characterized and assigned to a driving profile, so they can be served in order to classify new drivers and routes.

The pilot will study different customizations of the stages mentioned above (data sifting, encoding, and clustering algorithms) that produce the AI driving profiling model, testing them with tagged routes (both real and simulated) to refine the model’s accuracy and support the final Pay How You Drive and fraud detection services.
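The clustering step can be illustrated with a compact K-means implementation in plain Python. It is a sketch of the standard algorithm (assign each point to its nearest centroid, then recompute the centroids), with random initialization from the data points and a fixed number of iterations as simplifying assumptions; the number of clusters and the input vectors used in the pilot are not specified here.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50, seed=0):
    """Cluster `points` into k groups; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Initial centroids are k distinct data points chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

The resulting cluster labels carry no meaning by themselves; as described above, each cluster still has to be characterized and mapped to a driving profile before it can be used to classify new routes.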

6 Customized Car Insurance Services Powered by ML

As mentioned earlier in this chapter, the central problem in motor insurance, as in every insurance sector, is accurate risk assessment, and usage-based insurance (UBI) products and services are the proposed solution to this problem. Pay As You Drive and Pay How You Drive insurance products already exist in the insurance market, but their level of integration is significantly low. One of the services developed within the pilot is a Pay How You Drive insurance product with a higher level of accuracy than the existing ones, which exploits not only the numerous data provided by the On-Board Unit connected to the car’s OBD-II port but also context information, including weather conditions and traffic incidents. The innovation this pilot brings to the insurance business is expected to make the services developed attractive.

As in all UBI products, insurance premiums are not static and predefined based on historical data, but are based on the actual usage and, as a consequence, on the actual risk. Insurance companies will have the opportunity to provide their clients with personalized insurance products based on connected vehicles. With all the data gathered as described in previous sections, and using artificial intelligence (AI) techniques and machine learning (ML) algorithms, different driving profiles will be created, ranging from conservative and considerate drivers to aggressive, very aggressive, or extremely aggressive drivers. The more driving profiles are created by training the algorithms with real and simulated data, the more accurate the classification will be, and the more accurately this classification will be reflected in the premiums.

The following scenarios, a few among many, show how important it is to combine data collected from the vehicles with context information when creating driving profiles and classifying drivers into them:

  • Simple scenarios, such as driving above the speed limit, or making a turn or changing lanes without using the blinkers, where drivers can be classified into the different profiles depending on how far they deviate from the speed limit or on the percentage of maneuvers in which they use the blinkers.

  • Frequent lane changes on a straight segment of the road, classifying drivers according to how many times they change lanes over a specific distance driven.

  • Scenarios combining weather conditions with the monitored driving behavior, for example, driving at relatively high speed in severe weather conditions, or driving with the lights turned off after sunset or in low visibility conditions.

  • Finally, scenarios built using the data collected from the vehicle’s OBU together with traffic incidents, such as driving without reducing speed when approaching an area with a car accident or heavy traffic.
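Some of these scenarios translate directly into simple numeric indicators. The sketch below computes three of them (relative excess over the speed limit, percentage of maneuvers signaled with the blinkers, and lane changes per kilometre). The function and field names are hypothetical, and the sketch deliberately stops at the indicators: how they map to driving profiles is left to the trained model rather than to fixed thresholds.

```python
def scenario_indicators(speed_samples, blinker_used, lane_changes, distance_km):
    """Compute illustrative behaviour indicators for one trip.

    speed_samples: list of (speed_kmh, limit_kmh) pairs
    blinker_used:  list of booleans, one per turn or lane change
    lane_changes:  number of lane changes during the trip
    """
    if not speed_samples or not blinker_used or distance_km <= 0:
        raise ValueError("indicators need samples, maneuvers and a distance")
    # Relative excess over the limit; driving below the limit counts as 0.
    excesses = [max(0.0, (speed - limit) / limit)
                for speed, limit in speed_samples]
    return {
        "mean_speed_excess": sum(excesses) / len(excesses),
        "blinker_usage_pct": 100.0 * sum(blinker_used) / len(blinker_used),
        "lane_changes_per_km": lane_changes / distance_km,
    }
```

Context information can be folded into the same indicators, for example by replacing the posted limit with a lower reference speed when the weather source reports low visibility.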

Given all the data gathered and collected, these are clearly only a few of the scenarios that can be exploited to generate different driving profiles.

Another problem to deal with in motor insurance is the fraud detection in a car incident, such as an accident or theft. So far, the means insurance companies have to detect fraud in a claim are confined to:

  • Those related to police investigation (police reports, eyewitnesses, traffic control cameras, court procedure)

  • Those related to experts’ opinion, for example, experts or even investigators may find inconsistencies by examining the cars involved in an accident or the drivers involved

  • Those related to social media (there are cases where a relationship between the drivers involved in a car accident is revealed or even a video with the actual accident and moments before and after the accident is uploaded)

The data gathered and collected within the pilot can also be used to provide insurance companies with one more service: fraud detection. By processing these data, insurance companies can address use cases such as the following:

  • Detecting that the vehicle reported to be involved in the accident is different from the real one.

  • Detecting that the driver reported to be involved in the accident is different from the real one or from the usual driver of the vehicle.

  • Detecting that an accident has been reported in a different location from where it occurred.

  • Establishing the speed and driving reactions just before the moment of the accident and, more generally, all the circumstances of the accident.

  • Establishing the driving behavior in a period before the accident.

  • Last but not least, detecting fake car thefts.
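The location check, for instance, reduces to comparing the position stated in the claim with the last position reported by the vehicle. The sketch below uses the haversine formula for the great-circle distance; the 0.5 km tolerance and the function names are assumptions for illustration, and a real service would also have to account for GPS error and reporting delays.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def location_mismatch(reported, last_fix, tolerance_km=0.5):
    """True if the reported (lat, lon) is farther than the tolerance
    from the vehicle's last known (lat, lon)."""
    return haversine_km(*reported, *last_fix) > tolerance_km
```

A claim located in one town while the vehicle last reported from another, some 20 km away, would be flagged for review, whereas a claim at the vehicle's last reported position would not.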

Some of these cases, although certainly not all of them, can be detected even with a simple GPS tracker, and some may need to be considered in forthcoming legislation. This is the challenge of this service, and the result will be beneficial for the insurance companies, the insureds, and society in general, by assigning responsibilities correctly, paying claims according to the real responsibility, and avoiding fraud.

7 Conclusions

Calculating the risk of the insured object is one of the most delicate tasks when determining the premium for car insurance products. Traditional methods use common statistical data, providing general and sometimes biased results. The connected car scenario provides the possibility of getting more precise data about the vehicle, the driver, and their context, which offers new ways to improve driving safety and risk estimations.

With this idea, personalized usage-based insurance products rely on real-world data collection from IoT devices and frameworks to tailor novel services adapted to actual risks and driving behaviors. The pilot presented here, within the INFINITECH ecosystem, explores new ML/DL approaches to exploit datasets from vehicles enriched with new and relevant context information, and so to develop AI models that enhance the risk assessment for insurance services. This specialized data gathering process and the applied AI technologies support the business innovation that allows insurance companies to evolve the way they offer premiums and prices, according to their insured clients’ real profile instead of general statistics. In this line, the pilot implements a complete infrastructure to develop AI models and services for car insurance providers so that the proposed business models, Pay How You Drive and fraud detection, can be explored. Real datasets captured from CTAG vehicles, plus the synthetic and simulated data reported by our simulation environments, are enough to obtain a first version of the driving profiling AI model. However, to expand this solution and obtain more evolved and accurate outcomes, some challenges must be considered, mainly related to three issues:

  • Datasets and data sources: the wider the set of relevant data captured, the better the model will be. This requires involving as many connected cars as possible, which is why the whole system proposed here relies on automotive and IoT standards, trying to make integration easier for new car manufacturers and IoT providers. Extra effort should also be made to identify new sources that have an impact on driving profiling and to remove from the AI modeling those datasets that have less relevance.

  • Accuracy of the driving profiling: a key factor that should be improved by adapting ML/DL techniques, reinforcing training, and evaluating the impact of anonymization.

  • Evaluation and adaptation of the services offered to both insured clients and insurance companies, by involving related stakeholders on the new versions’ development.