1 Introduction

Within the Italian national economy, the motor third-party liability insurance sector (from now on: RCA insurance, for Responsabilità Civile Automobilistica) is of singular importance: both for its overall weight (premiums paid annually in Italy are of the order of 15 billion euros), and because, with approximately 35 million policyholders (from now on: p.h.s), the topic is deeply felt on a social level. So much so that it has prompted impromptu interventions, in my opinion amateurish or demagogic: I am referring to the “direct compensation practice” (the insurance covers the damage a p.h. causes, but the Company must also directly indemnify, within certain limits, the damage he/she suffers: this enormously complicates the already difficult problem of pricing), or to the rule according to which, if a family includes a virtuous motorist, all its members get very favorable conditions when they in turn sign up for insurance.

It is the writer’s personal opinion that even the behavior of the Companies is highly questionable, as regards technical rigor and compliance with the most elementary precepts of insurance science. Companies love to hide behind the principle of free enterprise: but it seems difficult to accept that the Italian Republic can control the price of credit (with the anti-usury law) but not that of a service the same Republic has made compulsory to buy.

Here, in summary, are the observations I intend to develop.Footnote 1

The need, not only commercial, to personalize tariffs is often pushed to technically unacceptable levels: some risks are insured that should not be, because the lack of adequate data prevents them from being evaluated correctly.

The method commonly used to construct the personalized premium, the “multiplicative” one, based on the breakdown of risks into several factors, ignores (in its basic form) the correlations between them, and therefore provides distorted results: this has been known for some decades,Footnote 2 and statistical methodologies are actually available (I cannot say how often they are used) to overcome this difficulty. Much less well known is the fact that, in any case, algebraic reasons make it impossible to price correctly, with the multiplicative method, the great variety of risks that the Companies propose to distinguish. The method itself also presents, as we will see, serious problems from the point of view of commercial transparency, and from that of managerial prudence. Ultimately, it is hard to understand why a risk that is assessable is not assessed directly; the answer that, with all its defects, the multiplicative method is the only one that makes it possible to price risks that are not sufficiently known must be rejected. Risks of this type should simply not be insured in their specificity.

Among the risk factors, the one measured by the (I assume, well known) “Bonus–Malus systems” deserves special mention. Mechanisms of this kind all present, in an accentuated form, the defects connected with the other risk factors: a deviation between tariff coefficients, almost entirely arbitrary, and technical coefficients, very difficult to evaluate reliably; and an evident, I would say provocative, correlation with the other personalization factors. The standard Italian one, still a mandatory reference term in the country, adds to all this a largely proven inefficiency for the purpose for which these systems were devised: extracting information from the individual history of each p.h. I completely agree that it is appropriate to take adequate account of such a history; however, the theory has developed (for decades) tools that allow this to be done in a much more rational way.

B–M systems only consider one’s reported accidents. I will not deal here with the idea, set out in Pinquet [4], of also bringing traffic offences into play (or at least some of them: in most cases, illegal parking should not be considered dangerous behavior). This of course gives a better knowledge of the individual profile. On the other hand, adding a new variable, and moreover one strongly correlated with the "number of claims" variable, cannot fail to accentuate the defects briefly listed above.

2 Personalization in RCA: a necessity not only commercial, but with some limits

The so-called personalization of the tariff for RCA insurance, i.e. the diversification of premiums by even very fine categories of p.h.s, represents a need from a commercial point of view. If a single Company offers lower-than-average prices to particularly virtuous motorists, the others must imitate it, on pain of suffering anti-selection in their portfolios.

Customization is also suggested by considerations of a completely different type. The “individual claims” variable has a very high variance: the application of the same average premium for all is therefore a rather crude solution. At the same time, when it comes to the risk of causing accidents, the reference to the concept of solidarity is completely out of place. If mutuality between homogeneous individuals is the basic principle of insurance, and the fact that the fortunate pay for the unfortunate is morally acceptable, or even desirable, solidarity between different risks must be seen in a very different way. That “good” drivers should pay for “bad” ones is an indefensible assumption. What is more: if solidarity helps people with incorrect driving habits not to pay for all the consequences of their actions, it should be seen as a sort of aiding and abetting of a criminal activity.

But solidarity in the other direction (from the bad drivers in favor of the good ones) also has aspects that make it inadvisable. Asking a “bad” p.h., who is already fairly charged an above-average premium, to pay an even higher one, can have an educational effect, but means encouraging him to go uninsured. And it is known that this phenomenon has grown to worrying dimensions in Italy: it is estimated to concern 6% of car drivers.

Apart from all this, however, there is a managerial reason which should suggest avoiding any form of solidarity. A tariff according to which some p.h.s pay a premium lower than the fair one, and others a higher one, is sustainable only if what the former underpay exactly balances what the latter overpay. On the other hand, trivially, such a tariff is attractive to the advantaged and not to the disadvantaged. It is therefore difficult to hypothesize that the market’s response will be able to guarantee equilibrium: many will want to profit from the low prices, but how many will agree to pay the high ones?

It remains to note that, if personalizing (I mean: letting everyone pay his personal fair price) is desirable, there are some limits to the possibility of achieving such a goal. Namely, there is the danger of contradicting the other basic principleFootnote 3 on which the possibility of practicing insurance is based: every risk must be known statistically, or it is not possible to estimate its cost. In order to be insured, a risk must therefore occur “in nature” a sufficiently large number of times. One cannot insure risks that do not belong to large enough groups, simply because it is impossible to price them.

On the contrary, it happens that some Companies push personalization too far, offering products for which, given their extreme specificity, they are unable to determine the right price; with obvious consequences not only in terms of commercial correctness, but also in terms of their profit prospects, or even survival (an aspect to which they should be even more sensitive).

I know, of course, that “insurance” against earthquakes or nuclear accidents exists, even in the absence of satisfactory statistical data.Footnote 4 But I observe that in these cases our statistical ignorance is, say, objective: those events happen “too rarely”. In RCA the ignorance is, on the contrary, artificially produced by the subjective choice of narrowing the field of observation too much, so that it ends up consisting of only a few individuals. Buying RCA insurance is mandatory: why do Companies build systems in such a way that customers cannot know whether what they are asked to pay is “fair”?

3 The multiplicative method: the technical coefficients

The personalized tariffs in use are based on the assumption that the riskiness of an individual is linked to his personal characteristics (customization variables, or risk factors). Some are observable a priori (age, place of residence,Footnote 5 …); another, which we could call the individual propensity to cause accidents, is revealed only by the behavior observed year after year: that is, by the number of accidents reported over time.Footnote 6 It is therefore qualified as “a posteriori”.

With respect to each variable, the p.h.s are divided into as many classes as there are values the variable can take; there may be, for example, the class of forty-year-olds, that of Avellino residents, that of people belonging to the fourth Bonus–Malus class,Footnote 7 …. If a Company chooses to use n variables, each p.h. belongs to n classes at the same time. Together, they determine his/her personal risk profile, which is described by a vector with n components:

$${\mathbf{p}} = \, \left( {p_{{1}} ,p_{{2}} , \, ...,p_{n} } \right)$$
(1)

whose generic α-th component specifies that, with respect to the α-th variable, he/she belongs to the pα-th risk class; for the a posteriori variable, the specific names merit class and B–M class are also in use.

The personal profile is valid for one year: some, or all, of its components may vary as time goes by, but the span of our analysis is a single financial year. All premiums are meant as yearly, and the Company fixes them in order to guarantee the balance for that year.

Suppose that the members of class j with respect to the variable α are numerous enough that the fair pure premium for them (let’s denote it by fα(j)) can be calculated, and that this is true for all variables and classes. The ratios between these “individual” premiums and the general average pure premium (ap: from now on, base premium) define the scale of univariateFootnote 8 technical coefficients (or factors) for that variable. I will denote them with tα(j):

$$t_{\alpha } \left( j \right) \, = f_{\alpha } \left( j \right)/ap.$$
(2)

The single tα(j) measures how much more (or less) claim-prone than the average a p.h. is who, with respect to the α-th variable, belongs to class j, and therefore how much more or less than the average he/she must fairly pay.

The pure personalized premium for a p.h. is calculated as the product of the base premium and the compound customization coefficient relating to his/her profile, defined in turn as the product of the n different technical coefficients and denoted by mt(p). The formulas are:

$$mt\left( {\mathbf{p}} \right) \, = t_{{1}} \left( {p_{{1}} } \right) \, \cdot t_{{2}} \left( {p_{{2}} } \right) \, \cdot \cdots \, \cdot t_{n} \left( {p_{n} } \right)$$
(3)

for the compound customization factor, and

$$mt\left( {\mathbf{p}} \right) \cdot ap$$
(4)

for the pure, “multiplicatively personalized” premium.
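Formulas (3) and (4) can be illustrated with a minimal sketch. The variable names, class labels, coefficient values and base premium below are all hypothetical, chosen only to show the mechanics; they do not come from any real tariff.

```python
from math import prod

def compound_coefficient(profile, coefficients):
    """mt(p), formula (3): product of the univariate coefficients t_alpha(p_alpha)."""
    return prod(coefficients[alpha][cls] for alpha, cls in enumerate(profile))

def multiplicative_premium(profile, coefficients, ap):
    """Formula (4): compound coefficient times the base premium ap."""
    return compound_coefficient(profile, coefficients) * ap

# Hypothetical scales of technical coefficients, one dict per variable.
t = [
    {"under_25": 1.5, "over_25": 0.9},   # variable 1: age (invented values)
    {"urban": 1.2, "rural": 0.8},        # variable 2: residence (invented values)
]
ap = 400.0  # hypothetical base premium, in euros

premium = multiplicative_premium(("under_25", "urban"), t, ap)
print(round(premium, 2))
```

With these invented figures the compound coefficient is 1.5 · 1.2 = 1.8, so the premium is 1.8 times the base premium.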

An important observation. The fact that formula (4) is often used by substituting the average gross premium for the pure one (ap) to get the final premium must be considered a (can I say?) gross abuse: in this way, in fact, even the fixed costs are “personalized”. This is in no way justifiable.

It remains established that, according to what was recalled in par. 1, insurance coverage for an individual with risk profile p can be correctly sold only if adequate specific statistics are available; that is, if the Company is able to correctly calculate the relative fair premium fp(p). But then the question arises spontaneously: why is this price not applied directly, and why are attempts made to reproduce it multiplicatively, with all the problems I am about to list? The answer that this method would be more transparent or more marketable, because the Company cannot publish/apply a tariff that provides for hundreds of particular cases, must be rejected as totally specious.

It therefore seems that Companies choose to reason in terms of univariate risk factors to overcome the problem of insufficient data. The data relating to the accident rate of, say, Avellino motorists, or of 40-year-olds, or of p.h.s belonging to the fourth B–M class, are certainly much more reliable than those relating to people who present the three characteristics simultaneously: the attempt is to obtain, say, multivariate knowledge using univariate data.

Unfortunately, the operation of combining the various factors presents many difficulties.

The elementary solution described above, which consists in multiplying the coefficients relating to each single variable, seems the most obvious: but it is also, obviously, wrong, because it completely neglects the correlations that do exist between the variables. It is trivial to observe that in the various Italian provinces the distribution of ages is different, and that of income is different as well. As for the “class B–M” variable, it is (I would say) provocatively correlated with all the others: in fact, if one thinks that a variable such as age somehow “informs” about the individual riskiness (otherwise, why take it into consideration?), it is obvious to expect that the more virtuous ages are more present in the upper rungs of the B–M ladder, the less attentive ones in the lower rungs. It is equally trivial to observe that if two characteristics tend to occur together, considering them separately (as the two factors of a product) means penalizing (or rewarding) a p.h. beyond what is due: the same “defect” (or “quality”) is in fact counted twice. The method therefore undoubtedly provides distorted and unfair tariffs. If we define the technical (say: “direct”, not multiplicative as mt(p) is) coefficient for profile p by

$$t\left( {\mathbf{p}} \right) = fp\left( {\mathbf{p}} \right)/ap$$
(5)

we have, in other words, that

$$t\left( {\mathbf{p}} \right) \, \ne mt\left( {\mathbf{p}} \right)$$
(6)

so that fp(p) is not equal to the quantity (4).
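Inequality (6) can be seen on a small numerical sketch. The portfolio below is entirely hypothetical: two binary risk variables whose "bad" classes tend to occur together, with invented headcounts and fair premiums. The univariate coefficients are computed from the marginal classes as in (2), and their product is compared with the direct coefficient (5).

```python
# Hypothetical portfolio: headcounts h(p) and fair premiums fp(p) per profile.
headcount = {(0, 0): 40, (0, 1): 10, (1, 0): 10, (1, 1): 40}
fair_premium = {(0, 0): 100.0, (0, 1): 200.0, (1, 0): 200.0, (1, 1): 400.0}

total_heads = sum(headcount.values())
# Base premium ap: average fair premium over the whole portfolio.
ap = sum(fair_premium[p] * headcount[p] for p in headcount) / total_heads

def univariate_coefficient(alpha, j):
    """t_alpha(j), formula (2): fair premium of class j of variable alpha over ap."""
    members = [p for p in headcount if p[alpha] == j]
    heads = sum(headcount[p] for p in members)
    cost = sum(fair_premium[p] * headcount[p] for p in members)
    return cost / heads / ap

def t_direct(p):
    """t(p), formula (5): direct technical coefficient of profile p."""
    return fair_premium[p] / ap

def mt(p):
    """mt(p), formula (3): product of the two univariate coefficients."""
    return univariate_coefficient(0, p[0]) * univariate_coefficient(1, p[1])

for p in sorted(headcount):
    print(p, round(t_direct(p), 3), round(mt(p), 3))
```

In this invented case the profile (1, 1) has a direct coefficient of about 1.67 but a multiplicative one of 2.25, while the profile (0, 0) is undercharged: the shared "defect" is counted twice, as described above.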

There is, in reality, widespread awareness of this fact, sometimes accompanied by an attitude of resigned acceptance (but see, as a partial correction, what I will say at the end of the paragraph). Much less known is that, in the presence of correlations between the variables, it is in any case impossible to determine coefficients capable of reproducing, via multiplication, the correct premium for each risk profile. For a formal demonstration, I refer to the article cited in note 1; I limit myself here to pointing out, very roughly, that the different risk profiles contemplated by a tariff often amount to some hundreds, while the corresponding premiums are calculated by multiplying, at most, a few dozen coefficients. Algebra does not allow, except in particular cases (which do not occur here), finding “few” numbers (the coefficients) which simultaneously satisfy “many” conditions (= correctly reproduce the many different fair prices).
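The rough counting argument can be made concrete with a short sketch. It is not the formal demonstration referred to above: the class counts are hypothetical, and the exactness check covers only the two-variable case, where a table of fair premiums is a product a(i)·b(j) only if all its 2×2 minors built on the first row and column vanish.

```python
from math import prod

def count_conditions_and_unknowns(classes_per_variable):
    """Return (number of risk profiles, number of coefficients) for a tariff."""
    profiles = prod(classes_per_variable)      # one fair premium to reproduce per profile
    coefficients = sum(classes_per_variable)   # one coefficient per class of each variable
    return profiles, coefficients

def is_exactly_multiplicative(table, tol=1e-9):
    """Two-variable case: True if table[i][j] == a[i] * b[j] for some a, b.

    Assumes strictly positive entries (premiums), so table[0][0] != 0.
    """
    rows, cols = len(table), len(table[0])
    return all(
        abs(table[0][0] * table[i][j] - table[i][0] * table[0][j]) <= tol
        for i in range(1, rows)
        for j in range(1, cols)
    )

# Hypothetical tariff: 18 merit classes, 10 age bands, 5 territorial zones.
print(count_conditions_and_unknowns([18, 10, 5]))
print(is_exactly_multiplicative([[100.0, 200.0], [200.0, 400.0]]))
print(is_exactly_multiplicative([[100.0, 200.0], [200.0, 500.0]]))
```

In the hypothetical tariff, 33 coefficients would have to reproduce 900 fair premiums; the second table shows how a single premium that departs from the product structure already makes exact reproduction impossible.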

The conclusion is that the multiplicative method is “constitutionally” incapable of providing correct rates.

To conclude this part, let us go back to the problem of the correlation between the customization variables, which I said is elegantly ignored by many Companies. It must be noted that some, on the contrary, do address it, and try to solve it by resorting to various statistical methodologies: I only mention the “generalized linear models” (“GLM”), which produce sets of coefficients calculated so as to take into account all the correlations between the variables used. Such coefficients have nothing to do with the “technical” ones mentioned above, but have the nature of “company coefficients”: a concept which I will discuss in the following paragraph.

These procedures certainly generate qualitatively better tariffs than those based on the naive method seen above; but note that such tariffs are of the multiplicative type and therefore destined, from the outset, to fail (“failure” meaning here: if the aim is to produce the fair premium for each risk). Furthermore, the basic logical misunderstanding remains completely unresolved: by their nature, these procedures produce systems of multiplicative coefficients capable of “better approximating” the fair premiums, which are supposedly known (otherwise the procedures themselves are not applicable), but which it is then a mistake not to use directly.

It is sometimes arguedFootnote 9 that a multiplicative model provides a method to estimate prices not directly known. Unfortunately, that model is used for all the risk profiles: including the ones for which the fair premium may be calculated directly, and is in this way replaced by an “unfair” one.

4 The multiplicative method: commercial (company) coefficients, and their normalization

I have illustrated in the previous paragraph how the Companies which use, for example, GLMs apply the (regrettable) multiplicative method using, as risk factors, quantities other than the technical coefficients directly obtainable from the statistical data.

It happens that all the Companies, including the ones that do not resort to GLMs, actually choose to do the same. It happens, I mean, that no Company uses the formulas (3)–(4) as we wrote them, but all start by replacing the technical, objective coefficients tα(j) with different, subjective quantities, which go by the names of commercial, or corporate, or tariff, or (finally) risk coefficients, or factors, and which I will denote by cα(j). It is, it seems, a question of marketing strategy. The coefficients produced by GLMs are differently motivated, but have the very same nature.

This choice implies some delicate consequences.

In fact, the application of the tα(j) ensures that, in principle, each p.h. pays his/her fair premium (then, admittedly, the multiplicative procedure spoils everything). But if company coefficients other than the technical ones are used, one starts off on the wrong foot. It is inevitable that the p.h.s of some classes find themselves paying less, and those of others more, than they should. Whether intended or not, some measure of that solidarity which I have said should be avoided pops up; and, with it, the problem that if there is not an exact balance between the discounts granted and the increases imposed (discounts and increases with respect to the fair measure), the financial balance is at risk.

To overcome not the problem of solidarity, but the risk of imbalance, Companies apply the multiplicative customization factor not, plainly, to the general average pure premium ap, but to a purposely modified one; or, which amounts to the same thing, they modify their own coefficients with a procedure of "normalization" that I now describe.

Consider, to begin with, the purely theoretical case of a single personalization variable: then the risk profile reduces to one risk class, the (now useless) index α can be dropped, and all the preceding formulas simplify significantly. For a portfolio that contains h(j) p.h.s of class j, a Company that just applies (3) and (4) with its commercial coefficients in place of the technical ones will collect

$$C \, = \sum c\left( j \right)h\left( j \right)ap$$
(7)

but will have to pay (predictably, on average)

$$P = \sum t\left( j \right)h\left( j \right)ap.$$
(8)

To obtain the balance, it will therefore calculate the individual premiums using not the officially announced coefficients c(j), but the "normalized" coefficients nc(j), defined as

$$nc\left( j \right) \, = c\left( j \right)P/C.$$
(9)

Indeed, the Company’s revenue will then be not (7), but

$$\sum nc\left( j \right)h\left( j \right)ap \, = \frac{\sum c\left( j \right)h\left( j \right)ap}{{\sum c\left( j \right)h\left( j \right)ap}} P = \, P.$$
(10)

Alternatively, one can think of modifying, by the same factor P/C, not the coefficients c(j) but the base premium ap: the final effect is, trivially, the same, but the method is more presentable (the tariff coefficients are officially announced, the base premium is not).
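The normalization (9) and the balance identity (10) can be checked on a small sketch. The three classes, their commercial and technical coefficients and the headcounts are hypothetical; the technical coefficients are taken to refer to the general market, so the portfolio need not reproduce the market average.

```python
def normalize(c, t, h):
    """Formula (9): nc(j) = c(j) * P / C, with C and P as in (7) and (8).

    The base premium ap multiplies both C and P, so it cancels in the ratio.
    """
    collected = sum(c[j] * h[j] for j in c)   # C / ap
    paid = sum(t[j] * h[j] for j in c)        # P / ap
    return {j: c[j] * paid / collected for j in c}

# Hypothetical single-variable tariff with three classes.
c = {1: 0.5, 2: 1.0, 3: 2.0}    # commercial coefficients (invented)
t = {1: 0.9, 2: 1.0, 3: 1.4}    # technical coefficients (invented)
h = {1: 70, 2: 20, 3: 10}       # headcounts per class (invented)
ap = 400.0                      # hypothetical base premium

nc = normalize(c, t, h)
revenue = sum(nc[j] * h[j] for j in nc) * ap
expected_cost = sum(t[j] * h[j] for j in t) * ap
print(round(revenue, 2), round(expected_cost, 2))
```

With these figures the correcting factor P/C is 97/75, about 1.29: the class that was promised a coefficient of 0.5 actually pays about 0.65 times the base premium.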

A brief digression. A different way of normalizing the commercial coefficients is sometimes proposed in theory, and/or actually adopted. It consists in multiplying all the c(j) by a factor K determined by the condition that the resulting overall portfolio mean personalization coefficient be equal to 1; that is, by the equation

$$K\sum c\left( j \right)h\left( j \right)/\sum h\left( j \right) = {1}.$$
(11)

Instead of (9) we have therefore, for these “differently normalized” coefficients, the expressions:

$$nc^{\prime}\left( j \right) \, = c\left( j \right)\frac{\sum h\left( j \right)}{{\sum c\left( j \right)h\left( j \right)}}.$$
(12)

In this way, the Company’s income amounts to

$$\sum nc^{\prime}\left( j \right)h\left( j \right)ap = \sum c\left( j \right)\frac{\sum h\left( j \right)}{{\sum c\left( j \right)h\left( j \right)}}h\left( j \right)ap = \sum h\left( j \right)ap$$
(13)

that is, exactly what it would pay if all its p.h.s cost ap, independently of their different qualities: which looks like an indefensible hypothesis. On this ground, I will consider the first method, the one described by (9), the only reasonable one. Here ends the digression.
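The difference between the two normalizations can be seen on a sketch with hypothetical figures. As an assumption of the example, the technical coefficients and the base premium refer to the general market, while the Company's portfolio is better than average, so its expected cost is below the headcount times ap.

```python
def normalize_to_unit_mean(c, h):
    """Formulas (11)-(12): scale c(j) so that the portfolio mean coefficient is 1."""
    k = sum(h.values()) / sum(c[j] * h[j] for j in c)
    return {j: k * c[j] for j in c}

# Hypothetical single-variable tariff (invented values).
c = {1: 0.5, 2: 1.0, 3: 2.0}    # commercial coefficients
t = {1: 0.9, 2: 1.0, 3: 1.4}    # technical coefficients, relative to the market
h = {1: 70, 2: 20, 3: 10}       # headcounts per class
ap = 400.0                      # hypothetical market-wide base premium

nc_prime = normalize_to_unit_mean(c, h)
income = sum(nc_prime[j] * h[j] for j in h) * ap     # formula (13)
expected_cost = sum(t[j] * h[j] for j in h) * ap     # formula (8)
print(round(income, 2), round(expected_cost, 2))
```

Here the income equals 100 · ap whatever the composition of the portfolio, while the expected cost is 97 · ap. The two coincide only when the portfolio's average technical coefficient happens to be 1.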

The practice of normalization can be judged rather questionable (some comments follow), but is entirely customary. As I said, it is meant to overcome the damage that passing from the technical to the commercial coefficients may cause to the safety of the Company. To have a quantitative idea of its incidence, consider that the factor necessary to “normalize” (or, maybe better, to correct) the merit coefficients of the standard Italian B–M system, if we imagine applying it to an average portfolio, is of the order of magnitude of 2. The ones referring to the a priori variables are, admittedly, smaller: more on this in par. 5.

Now, the comments I promised.

There is, to begin with, a relevant problem of transparency (not to say: of commercial fairness). The Companies publish, or at least list in the official reports required by law, the tariff (commercial) coefficients, but actually use the different, normalized ones. In other words: they promise certain categories some discounts with respect to the average premium; but, before applying them, they increase (substantially, and let’s not hesitate to say: secretly) the premium on which the discounts are calculated.

The second problem, in which the Companies should be more interested, is that the normalization of the coefficients is a procedure that refers to a known portfolio: if the h(j) are not known, the factor P/C cannot be calculated [nor can Eq. (10) even be written]. On the other hand, the procedure must be performed when the tariffs are designed. So those essential quantities can only be forecast. If the market response is different from the imagined one, the imbalance is inevitable.

Apart from the ambiguity about the normalization procedure, what precedes must be considered well known (though ignored). On the contrary no one, as far as I know, has ever pointed out that, to have the required effect, the normalization must be carried out not separately for each set of coefficients, as is customary, but simultaneously on the whole of those used.

Suppose that the Company normalizes separately the scales of the n commercial coefficients it has chosen to employ. Then the final compound customization factor for the risk profile p is

$$mnc\left( {\mathbf{p}} \right) = nc_{{1}} \left( {p_{{1}} } \right) \cdot nc_{{2}} \left( {p_{{2}} } \right) \, \cdot \, \cdots \, \cdot nc_{n} \left( {p_{n} } \right)$$
(14)

If the p.h.s with risk profile p number h(p), the Company will then collect

$$\sum mnc\left( {\mathbf{p}} \right) \cdot h\left( {\mathbf{p}} \right) \cdot ap$$
(15)

but will have to pay (on average)

$$\sum t\left( {\mathbf{p}} \right) \cdot h\left( {\mathbf{p}} \right) \cdot ap$$
(16)

(sums are over all the different risk profiles present in the portfolio; formulas (15) and (16) are the “multi-variable” versions of (7) and (8), respectively). There is no reason at all to hope that quantities (15) and (16) coincide; the normalization necessary to guarantee (on average!) the balance is obtained by multiplying the base premium ap or, alternatively, all the scales of commercial coefficients by the quantity given by the expression

$$\sum t\left( {\mathbf{p}} \right) \cdot h\left( {\mathbf{p}} \right) \cdot ap/\sum mc\left( {\mathbf{p}} \right) \cdot h\left( {\mathbf{p}} \right) \cdot ap$$
(17)

Needless to say, this can be done only if the h(p) are known before the tariff is presented to the market; which is, say, rather improbable. Anyway, it does not seem that Companies even try to do it: they content themselves, as far as I know, with normalizing separately the various scales of coefficients, and then calculating their product.
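The gap between separate and joint normalization can be illustrated on a sketch with two correlated binary variables. All the figures are hypothetical; the univariate technical coefficients are those implied by the invented direct coefficients and headcounts. Each scale is first normalized on its own with (9), and the resulting revenue (15) is compared with the expected cost (16); the joint factor (17) is then applied to the base premium instead.

```python
# Hypothetical two-variable portfolio (all figures invented for illustration).
headcount = {(0, 0): 40, (0, 1): 10, (1, 0): 10, (1, 1): 40}            # h(p)
t_direct = {(0, 0): 100 / 240, (0, 1): 200 / 240,
            (1, 0): 200 / 240, (1, 1): 400 / 240}                        # t(p)
t_univariate = [{0: 0.5, 1: 1.5}, {0: 0.5, 1: 1.5}]   # t_alpha(j) implied by t_direct
c = [{0: 0.8, 1: 1.1}, {0: 0.7, 1: 1.2}]              # commercial scales c_alpha(j)
ap = 240.0                                            # base premium

def marginal_heads(alpha, j):
    """Number of p.h.s in class j of variable alpha."""
    return sum(h for p, h in headcount.items() if p[alpha] == j)

def normalize_scale(alpha):
    """Single-variable normalization (9), applied to scale alpha on its own."""
    collected = sum(c[alpha][j] * marginal_heads(alpha, j) for j in c[alpha])
    paid = sum(t_univariate[alpha][j] * marginal_heads(alpha, j) for j in c[alpha])
    return {j: c[alpha][j] * paid / collected for j in c[alpha]}

nc = [normalize_scale(0), normalize_scale(1)]

# Expected cost, formula (16), and revenue with separately normalized scales, (15).
expected_cost = sum(t_direct[p] * h for p, h in headcount.items()) * ap
separate_revenue = sum(nc[0][p[0]] * nc[1][p[1]] * h for p, h in headcount.items()) * ap

# Joint normalization: expression (17) applied once to the base premium.
raw_revenue = sum(c[0][p[0]] * c[1][p[1]] * h for p, h in headcount.items()) * ap
joint_factor = expected_cost / raw_revenue
joint_revenue = joint_factor * raw_revenue

print(round(expected_cost, 2), round(separate_revenue, 2), round(joint_revenue, 2))
```

In this invented case the separately normalized tariff collects roughly 2.5% more than the expected cost; with other coefficients or headcounts the error could just as well be a shortfall. Only the joint factor restores the balance, and it requires knowing h(p) for every profile.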

5 The B–M variable

Among all the personalization variables, the “a posteriori” one deserves separate consideration. It is (very appropriately) present in all the tariffs I know. Indeed, it directly reveals what the “a priori” variables only try to predict on a statistical basis, but what it would of course be essential to know exactly: the individual “dangerousness”.

It is customary to take account of the p.h.’s history through a “Bonus–Malus” type mechanism,Footnote 10 the functioning of which I assume to be known (apart from the specific details of the system devised by each Company: number of classes, evolutionary rules among them, merit coefficients,Footnote 11 …). The possible determinations of this “a posteriori variable”, or merit classes, correspond to the different steps of the B–M scale.

Everything illustrated in par. 3 and 4 can be repeated here, in accentuated form. The variable B–M is obviously correlated with all the a priori variables (I have already observed this); and the discrepancy between the merit coefficients officially promised by the standard Italian system or by each of the Companies, and the normalized ones, actually used, is much more striking than that relating to the other variables. The point is that, rather ironically, the coefficients for the a priori variables are chosen “a posteriori”: that is, starting from actually observed data. On the contrary, the merit coefficients (the ones concerning the a posteriori variable) appear to be chosen “a priori” in a fanciful (not to say, almost completely arbitrary) way; the result is that the ones applied to the different classes of the system have little to do with the real riskiness of the p.h.s placed in each of them. Normalizing them is, simply, essential; but this means strongly modifying them. Speaking very loosely, Companies promise that “very good drivers” will pay “a half of the premium”, but neglect to say that this half will be calculated on an almost doubled premium.

It should be added that the “quality” of the mechanisms adopted, i.e. their ability to correctly discriminate risks, is generally very poor. There is almost always a strong concentration of p.h.s in the upper classes (those of “Bonus”), which must be considered pathological: the great majority have annual claim frequencies close to the average, and should therefore stabilize instead around the central classes. In a series of unpublished works dating back to around 2010, Verico studied the Italian standard B–M system, still a mandatory reference term for all Italian contracts. She calculated the average merit coefficient of a portfolio to be considered sufficiently representative as 0.56. Such a value implies that, against the scale of 18 promised merit factors (ranging from 0.50 to 2), factors from 0.9 to 3.6 are actually applied; and there are not, as the tariff promises, 12 classes entitled to a discount on the average premium, but only 3. These results must be considered substantially valid even today, in the presence of an average claim frequency much lower than then (the Italian claim frequency was 8.3% in 2010, 4.71% in 2021).
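The arithmetic behind the range of applied factors can be checked in a few lines. The sketch assumes that the applied factors are obtained by dividing the promised ones by the average merit coefficient of the portfolio; only the endpoints of the scale and the value 0.56 are taken from the text, and the variable names are illustrative.

```python
average_merit_coefficient = 0.56          # value reported in the text
promised_min, promised_max = 0.50, 2.00   # endpoints of the promised scale

# Assumption: each promised factor is divided by the portfolio average.
correction = 1.0 / average_merit_coefficient
applied_min = promised_min * correction
applied_max = promised_max * correction

print(round(correction, 2), round(applied_min, 1), round(applied_max, 1))
```

The correcting factor of about 1.79 is consistent with the "almost doubled" premium mentioned earlier, and under this assumption a class gets a real discount only if its promised coefficient is below 0.56.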

I of course accept the idea that a B–M mechanism works in the sense of discouraging the reporting of small accidents: but this is, unfortunately, a very difficult effect to measure. Nor do I exclude that efficient B–M systems can be designed; but please remember that they fully belong to that multiplicative “philosophy” whose validity I am contesting. I prefer to recall that for many decades there have been theoretical results of easy applicability, which solve the problem of reconciling a priori information with that deduced from observation in a brilliant and indisputably correct way. I mention what is perhaps the best known.Footnote 12 Under the hypothesis, widely used in theoretical studies, that the number of accidents reported annually by a p.h. follows a Poisson law governed by his individual frequency, a parameter which is in turn distributed according to a Gamma distribution, the expected value of that frequency for someone who in n years has reported k claims is given by the disarmingly simple formula

$$\frac{a + k}{{b + n}}$$
(18)

a and b being two fixed parameters: which can be common to all (and therefore relating to the entire universe of potential p.h.s), or differentiated on the basis of one or more a priori customization variables (e.g. by geographical area). The product of the quantity (18) by the average cost of the claim obviously gives the personalized pure premium; in which it is easy to observe how the “weight” of the fixed part diminishes over time until, in the limit, it vanishes. In the very same logic, Constantinescu [1] proposes some similar formulas that take into account not only the number, but also the severity of the claims.
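Formula (18) lends itself to a direct sketch. The parameter values and the average claim cost below are hypothetical. The helper `credibility_weight` is an illustrative name: it makes explicit the identity (a + k)/(b + n) = (1 − z)·(a/b) + z·(k/n) with z = n/(b + n), which shows how the weight of the fixed part a/b shrinks as the years of observation grow.

```python
def expected_frequency(a, b, k, n):
    """Formula (18): expected claim frequency after k claims reported in n years."""
    return (a + k) / (b + n)

def credibility_weight(b, n):
    """Weight z = n / (b + n) given to the observed frequency k / n."""
    return n / (b + n)

def pure_premium(a, b, k, n, average_claim_cost):
    """Personalized pure premium: expected frequency times average claim cost."""
    return expected_frequency(a, b, k, n) * average_claim_cost

a, b = 0.5, 10.0              # hypothetical parameters; a priori frequency a/b = 0.05
average_claim_cost = 4000.0   # hypothetical average cost of a claim

for n, k in [(0, 0), (5, 1), (10, 0), (40, 2)]:
    print(n, k,
          round(credibility_weight(b, n), 3),
          round(pure_premium(a, b, k, n, average_claim_cost), 2))
```

With no history the premium is the a priori one; after 40 years the observed frequency carries 80% of the weight in this invented example.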

6 Conclusions

My rather crude conclusion is that if the fair price for a risk profile is known, it should simply be applied; if it is not, that risk should not be insured. RCA insurance is, in many countries, compulsory: in a market where customers are forced by law to buy, the sellers can (I think) accept that their rights, too, be somewhat reduced.

Apart from this, the technical approach to the problem of pricing is, from many points of view, highly questionable.