A comprehensive model for cyber risk based on marked point processes and its application to insurance

After scrutinizing technical, legal, financial, and actuarial aspects of cyber risk, a new approach for modelling cyber risk using marked point processes is proposed. Key covariates, required to model frequency and severity of cyber claims, are identified. The presented framework explicitly takes into account incidents from malicious untargeted and targeted attacks as well as accidents and failures. The resulting model is able to include the dynamic nature of cyber risk, while capturing accumulation risk in a realistic way. The model is studied with respect to its statistical properties and applied to the pricing of cyber insurance and risk measurement. The results are illustrated in a simulation study.


Introduction
Researchers and practitioners from different disciplines have analysed 'cyber risk' and 'cyber insurance' from their provenience, among them IT system experts, economists, statisticians, actuaries, etc.; a recent survey of the literature on these topics in business and actuarial science is provided in Ref. [1]. Despite the lack of an established agreed-upon framework, all stakeholders share the opinion that cyber risk is on the rise. This is substantiated by continuously changing and expanding cyber threats [2] and an increasing frequency and magnitude of the financial consequences of cyber incidents [3][4][5][6]. The potential consequences of cyber incidents 1 3 have also been prominently covered by the media; examples being [7][8][9]. This has lead to various cooperations between academia, industry, and government agencies (e.g. CISA 1 in the US or CiSP 2 in the UK) with the aim of developing defense strategies against cyber crime and enhancing the overall resilience of IT networks. Corporations have moved their perception of cyber security from a merely technical topic to a larger business risk [10], but this awareness does not yet seem to have translated into widespread institutionalisation of cyber risk management [11]. A recently highlighted aspect is the connection between Cyber Incidents and Business Interruption, which jointly ranked as the top global business risks in a 2019 survey [12].
One strategy to cope with risk is risk transfer, e.g. via insurance contracts. Parallel to the risk, the demand for cyber insurance solutions has been continuously increasing [10] over the last few years. In spite of its growth, however, today's cyber insurance market is still relatively small compared to the value of the assets that could be impaired by a cyber event [2]. Barriers are not a lack of demand for cyber risk transfer, but rather a number of obstacles that complicate the understanding and quantification of the underlying risk, including the lack of solid data on losses, a fast-paced evolution of cyber risk, and the disparity of data protection laws globally [4,13].
Despite these challenges, especially in the US an existing market is already established; including underwriters, brokers, and organisations specialized on cyber data analytics [14]. Concerning the pricing of cyber risk, however, a surprising finding was published by Romanosky et al. [15]: they systematically analysed cyber policies across the US and found that the main themes used for pricing included looking to competitors and estimation/guesswork. The ad-hoc nature of cyber policy pricing confirms that a unified quantitative understanding of this new type of risk and its underlying drivers is still at its infancy.
The cyber risk model developed in the present work, designed from an actuarial point of view, constitutes a threefold contribution: 1. The model is based on a holistic approach to cyber risk, systematically describing the underlying risk factors while including information-technological, economic, and actuarial viewpoints. 2. The model is able to capture dependencies and accumulation risk in a realistic way by explicitly taking into account idiosyncratic cyber incidents and systemic cyber events. 3. Using the loss distribution approach, the model can easily be applied in an insurance framework. A simulation study illustrating this application is included.
The remainder of this paper is structured as follows: Sect. 2 carefully reviews the existing literature on cyber risk and identifies key findings for an actuary. Section 3 presents a holistic view on cyber risk, including key characteristics and risk factors.
1 3 [35][36][37]) or on embedding cyber into the better-known context of operational risk management (e.g. Refs. [38,39]). One feature of cyber risk that is commonly regarded as particularly problematic is the lack of independence among the risks/claims, a problem that was addressed using copula approaches in Refs. [40,41], linear correlations in Ref. [24], and a combination of both in Ref. [23]. More recently, Peng et al. [42] studied the multivariate dependence exhibited by real-world cyber attack data using a Copula-GARCH model. The latter works describe cyber attacks of different types or multivariate nature to be the source of dependence. Peng et al. [43] propose modelling and predicting extreme cyberattack rates by using marked point processes and similarly, studying an empirical data set of breach incidents [44] argue that stochastic processes rather than distributions should be used to model and predict hacking breach incident inter-arrival times and breach sizes. Baldwin et al. [45] find strong evidence of contagion in cyber attacks to different components of a firm's information system using self-and mutually-exciting point processes.
Instead of considering underlying attack rates, studies concerned with cyber insurance seek to quantify the expected monetary losses of an insurer's portfolio. To this end, dependencies between losses can also be captured by considering a model of epidemic spreading on the underlying network of firms. Fahrenwaldt et al. [46] use a (Markovian) SIS-process to model the infectuous spread of a cyber vulnerability and subsequently an adapted counting process for the occurrence of attacks. Xu and Hua [47] use Markovian and Non-Markovian processes for epidemic spreading and propose to use a copula approach to capture the dependence among time-toinfection distributions. Xu et al. [48] study a model of cyber epidemics over complex networks, additionally introducing copulas to capture dependencies between cyberattack events.

Data-driven studies
The lack of publicly available, reliable, and sufficiently large data sets for cyber incidents remains one of the obstacles for sound statistical investigations. Among the best-known data sources on data breaches are the continuously updated "Chronology of Data Breaches" dataset by the California-based nonprofit corporation Privacy Rights Clearinghouse (PRC) 3 and the "Open security foundation data loss database". 4 The former data was e.g. studied by Edwards et al. [49], with the conclusion that the number of records exposed can be modeled by the log-normal law and the daily frequency can be described by a negative binomial distribution. Somewhat surprisingly, the study found neither size nor frequency of data breaches to exhibit a time trend. Eling and Loperfido [50] use multidimensional scaling and goodnessof-fit tests to analyze the distribution of the data breach information. They show that modelling severity using a log-skew-normal distribution seems adequate and find that different types of data breaches need to be modeled as distinct risk categories.

3
A comprehensive model for cyber risk based on marked point… Eling and Jung [51] study the cross-sectional dependence of the data breach losses and identify a significant asymmetric dependence of monthly cross-industry losses in four categories by breach types as well as cross-breach type losses in five categories by industries. Farkas et al. [52] analyze heterogeneity of the reported cyber claims through the use of regression trees.
The second database was examined in Ref. [53], who focus on the theft of personal information and report a stable power-law tail distribution of personal identity losses per event. Wheatley et al. [54] combined data from both databases to focus exclusively on large breaches and study maximum breach sizes as well as severity distributions. The best fit is obtained by using a doubly truncated Pareto (Power law) distribution with linearly decreasing shape parameter for breach sizes, with sub-linear growth for the maximum log breach size.
Romanosky [55] uses a (commercial) dataset from Advisen, a US-based consultant to the insurance industry, with the aim of examining the composition and costs of cyber events. They conclude that firms may lack a strong incentive to increase their investment in data security and privacy protection and the primary motivation may come from the cyber insurance industry through its use of incentive-based premium reductions.
While the aforementioned papers mostly concentrate on data breaches, Eling and Wirfs [56] has a wider focus: they define cyber risk as a subgroup of operational risk and analyze cyber data from a large operational risk database (SAS OpRisk Global data), including a global range of cyber incidents that have occurred over an around twenty-year period and considering actual costs instead of number of affected records only. The frequency of losses is found to be most adequately modelled by a Negative Binomial distribution in a static approach, and a Poisson process with covariate-dependent rate in a dynamic approach based on Ref. [57]. For the loss severity, none of the canonical candidates (exponential, Gamma, log-normal, loglogistic, generalized Pareto, Weibull) were found to accurately model the entire loss data. Promising alternatives were a non-parametric transformation kernel estimation and an extreme value approach, where excesses over a threshold were modelled by a generalized Pareto distribution. The study highlighted the importance of distinguishing between cyber risks of daily life and extreme cyber risks.

Background on cyber insurance
Marotta et al. [25] provides an excellent summary of the past, present, and future of the cyber insurance market; as seen in 2017. They report an ongoing growth of available coverage, spurred by rising demand for insurance protection against cyber risks, which in turn is often caused by public coverage of severe cyber incidents [14,25,58], the introduction of stricter legislation across the globe [2,25], and firms' own loss experience [14,58]. In 2015, the global market for cyber insurance was estimated to be worth around $2 billion in premium, with US business accounting for approximately 90% . At the time, fewer than 10% of all companies had purchased cyber insurance, with typical buyers coming from industries holding large volumes of personal data, such as healthcare and retail, or relying on digitalized technology processes, such as manufacturing and telecommunications. A rapid market growth was projected, with total premium reaching $20+ billion by 2025 [4]. As of today, this estimate still seems realistic, with a global market size of around $7 billion in 2020 [59]. However, despite a strong growth and new insurance solutions being developed continuously, in 2017 the cyber insurance market in the US still had not reached the expected size predicted by optimistic forecasts [25]-thus the question of challenges inhibiting the market development arises.

Challenges and insurability of cyber risk
For the European market's supply side, ENISA et al. [13] identified the lack of solid data on losses, the fast pace of technology evolution, and the lack of adequate reinsurance among the key factors. Regarding the demand side, companies' most often mentioned reasons to refrain from purchasing cyber insurance include high prices [10,11,14,58,60], lack of availability of desired limits and coverage [14,58,60], concerns about numerous exclusions and restrictions [10], and lack of understanding about own exposure [61] or about policy offers [11].
A fundamental question is if, and under what circumstances, cyber risk is insurable at all, given its complex nature. ENISA et al. [13] first examined this question and concluded that cyber might well be an insurable type of risk fulfilling almost all of the considered desiderata. A more detailed analysis based on a dataset from an operational risk database was conducted in Ref. [62] and subsequently adressed in Ref. [63]. Their study identified the main problems to be lack of independence of loss occurrence, presence of information asymmetries, and lack of adequate cover limits. However, they remark that some problematic aspects might be alleviated in the future and thus advocate for systematic data collection, e.g. via platforms for data sharing organised by national regulators or international associations.

Cyber insurance policies: coverage and exclusions
Ignoring the academic question "to be (insurable), or not to be," in practice an immature cyber insurance market has developed and an increasing scope of cyber insurance products is available. The majority of coverage is offered as dedicated cyber coverage [11,14], with customers frequently shifting from endorsement to standalone policies [58]. The most sought-after types of coverage include cyber-related business interruption, data breaches, cyber extortion, and fund transfer fraud/social engineering [14,58]. Cyber policies typically cover the most common and costly incidents, including human error, mistakes, and negligence, external attacks by cyber criminals, system or business process failures, and malicious or criminal insiders. Rarely, however, attacks against business partners, vendors, or other third parties are included [10]. All policies generally distinguish between first and third party (liability) losses [15]. A systematic qualitative analysis of cyber insurance policies across the US [15] found a surprisingly strong similarity regarding covered losses, where the ten most commonly covered losses included costs of claims expenses (including legal expenses from penalties, defense, and settlement costs), public relations services, costs of notification of affected individuals, business income loss, data or system restoration, forensic investigation costs, and data extortion expenses. Romanosky et al. [15] points out that the top covered costs are cleanup costs, i.e. indirect costs in order to comply with laws, manage the firm's reputation, and reduce further expenses following a breach. Other studies found similar results for the covered types of losses (e.g. Refs. [2,10,16,25]).
Regarding exclusions, Romanosky et al. [15] found more variation between policies, where the most common exclusions stemmed from criminal, fraudulent, or dishonest acts, errors or omissions, intentional violation of a law, criminal investigations or proceedings, and payment of fines, penalties, or fees. Furthermore, hard-to-quantify costs like loss of employee productivity or brand damage are often excluded [10].
Lastly, an important issue to mention is non-affirmative or silent cyber cover, meaning that cover for cyber incidents may exist for example in traditional property and casualty policies, even though this was not the intention of the underwriter [12]. Misconceptions like this might lead to a dangerous perception gap for insureds [11] who suffer from an illusion of protection as well as insurers who might suffer from (unintentionally written) exposure to cyber risk.

Cyber insurance: risk assessment and pricing in practice
In the US, carriers typically assess an applicant's cyber risk through questionnaires, most of which emphasize the amount and type of data handled by the investigated company, whereas the technical infrastructure and IT security management receive less attention [15]. The sample questionnaire for risk assessment for cyber insurance by the German Insurance Association [64] differentiates between three risk categories primarily according to the annual turnover of a company and, secondarily, according to certain risky business units (e.g. e-commerce or handling of sensitive data), where the number of questions for a candidate increases with increasing risk category.
Regarding pricing, there seem to be large differences between carriers, while surprisingly, some of the recurring themes are reliance on external sources, estimation, comparison with competitors, using underwriter's experience, and adaptation of prices from other insurance lines [15]. Similarly, respondents in Refs. [14,58] stated that competition between carriers seemed to prevail over actuarial assessment of the cost of risk. Most examined policies in Ref. [15] multiply a base premium by variables relating to standard insurance factors and industry-related factors, where high hazard weightings are assigned to businesses that collect and store a high volume of sensitive data or operate in industries like retail, healthcare, and the financial industry. Finally, premium multipliers are commonly assigned according to the outcome of the questionnaire regarding IT security (e.g. privacy controls, network security controls, existence of an incident response plan). In conclusion, the impression manifests that while insurers are trying to get a better understanding of cyber risk and its drivers, due to the lack of ample reliable data to describe the problem with sufficient statistical precision, as of today pricing often happens on an ad-hoc basis and established quantitative models do not exist, yet.

The potential of cyber insurance: insurance as a service
While traditionally insurance is a means of risk transfer, cyber insurance can potentially offer more than compensation for monetary losses. Many insurers already advertise the services their cyber insurance policies include, e.g. prevention and incident response services or crisis communication support [65]. Moreover, ENISA et al. [13] highlights possible benefits of the development of a cyber insurance market such as the potential to incentivise firms to increase IT security through premium discrimination or the development of a market for security consulting firms that investigate security practices as part of the underwriting process.
Another future topic for insurers concerns arrangements and standards that facilitate sharing data and information about cyber incidents. In order to help corporations to overcome their resentments about sharing such data, it is the insurers' task to demonstrate that pooling data enables them to improve their range of services and design adequate new and transparent products that meet companies' needs [11].
Thus, despite most academic works concluding that in their theoretical frameworks cyber insurance cannot improve social welfare or network resilience, in practice the development of adequate, transparent cyber insurance products and services might entail a number of benefits transcending a mere possibility for companies' cyber risk transfer. In summary, during the last few years research on cyber risk has considerably increased and various aspects have been considered (disjointly). Our work focuses on the viewpoint of actuarial science, but we aim at providing a holistic modelling approach, taking into account both IT security and economic factors.

Cyber risk: a holistic view
Cyber risk as a multi-faceted and young risk still lacks an established definition in the (insurance) literature. We therefore introduce key characteristics and risk factors a cyber risk model should comprise.

Definition and key characteristics
Eling et al. [66] summarizes the origins, consequences, and key characteristics of cyber risk as follows: "Any risk emerging from the use of information and communication technology (ICT) that compromises the confidentiality, availability, or integrity of data or services. [...] Cyber risk is either caused naturally or is manmade, where the latter can emerge from human failure, cyber criminality (e.g. extortion, fraud), cyberwar, and cyber terrorism. It is characterised by interdependencies, potential extreme events, high uncertainty with respect to data and modelling approach, and risk of change." Interpreted from the actuarial perspective, the traditional approach of quantifying risk by frequency and severity of incidents, and combining them (potentially using an appropriate dependence structure) to obtain an aggregated loss distribution is complicated for cyber risk. We follow [25,63] in summarizing the central properties of cyber risk: • Absence of historical data: The novelty of this risk and the absence of an established terminology for cyber incidents makes it difficult for insurers to create a reliable database with information on losses. This is exacerbated by a reporting bias, i.e. companies are often reluctant to reveal incidents in order to avoid reputation damages. • Dynamic risk type: Cyber risk is as non-stationary as the underlying technology and legal framework, which makes the usability of past data for modelling future losses difficult. Among the main features that underscore the dynamic nature of cyber risk are the growing speed and scope of digital transformation, widening sources of vulnerability from hyperconnectivity, and the evolution of threat actors [2]. • Strategic threat actors: Cyber losses do not occur in a completely random fashion, as they are often caused by malicious actors with strategic (economic) motives and attack patterns. In 2018, Lewis [6] even described the trend of cybercrime as a service (CaaS) encompassing a large diversity and volume of cybercrime offerings, including products (e.g. exploit kits, custom malware) and services (e.g. botnet rentals). Around this, a thriving cybercrime economy has emerged from the related communities, offering for instance product development and technical support. • Interdependence/Accumulation risk: The interconnectedness of IT-systems and the often systemic nature of vulnerabilities induce a dependence structure within and across company networks and the potential for loss accumulation. • Interdependence of security: Another result of the network interdependence are negative externalities regarding security, which within a game-theoretical context might lead to an equilibrium in which all companies underinvest in security and, therefore, the overall network is not sufficiently protected. • Difficult impact determination: Due to the intangible nature of information assets, it is often difficult to quantify the economic consequences of a cyber incident. • Information asymmetry: Cyber insurance exhibits two sorts of information asymmetry: Adverse selection and moral hazard. The former refers to the challenge for an insurer to reliably determine a company's risk exposure, the latter refers to the difficulty of ensuring the risk exposure to be maintained throughout the entire contract period.
As we focus on actuarial questions, we refrain from considering in more detail technological aspects of information security, the economics of cyber security and cybercrime, or the legal framework. However, one important aspect to be mentioned concerns the role of governments and legislation. For example, in the European Union, the General Data Protection Regulation (GDPR) came into force on May 25, 2018 with fines up to 20 € million or 4% of annual global revenues, emphasizing that the respective legal framework must be considered when modelling the size of cyber insurance claims, as penalties and fines may be included in the coverage. Furthermore, besides setting the legal framework, Anchen [2] argued that the government could help to promote cyber resilience by reshaping incentives and increasing awareness of cyber threats.

Cyber risk factors
So far, the term risk was used informally. Going further, we disintegrate risk into a combination of threat, vulnerability, and impact (c.f. Ref. [25]). A threat is the underlying root cause of the risk, which itself does not necessarily manifest as an incident, but is only harmful if there is a corresponding vulnerability in the target system. If a threat and an existing vulnerability lead to the occurrence of an incident, the impact refers to the consequences, which can be tangible (e.g. direct financial consequences) or intangible (e.g. loss of reputation). The process of risk management classically consists of identifying risks by characterising threats, vulnerabilities, and impacts, analysing risks with regards to the probability and impact of an incident and treating the estimated risks by selecting and applying adequate measures. As outlined in Ref. [25], there are four classical ways of dealing with risks: risk reduction, risk transfer, risk avoidance, and risk acceptance. Clearly, cyber insurance is a tool for risk transfer and a potential incentive for risk reduction.

Threats
In order to assign cyber incidents to a few distinct classes, we recall a quite concise definition of cyber risk originally motivated by the study of operational risk management, namely "operational risks to information and technology assets that have consequences affecting the confidentiality, availability, or integrity of information or information systems." [38].
We follow [37] in applying this definition to classify cyber incidents according to three classical information security protection goals: confidentiality, integrity, and availability of information assets [67]. Table 1 gives an overview of their definitions and the incident types that compromise each goal.
Of course, these categories are not mutually exclusive; an example combining features of fraud and business interruption is a Ransomware attack, i.e. extortion for temporarily withheld data. We nevertheless implement the above distinction, as it is known from data breaches that incidents of different kinds typically show a different statistical nature [68] and, moreover, the economic consequences vary across the incident categories [55]. An incident falling into more than one class could e.g. be assigned partially to both of them according to the losses it entails, e.g. for the Ransomware case, losses from the interruption of operations under BI and losses from ransom payments under FR. Furthermore, we can understand FR as a general class of incidents that cannot be distinctly classified as DB or BI. Table 1 Classification of three main types of cyber incidents according to the basic information security protection goal they compromise Type of cyber incident Abbreviation Information security goal compromised by this incident type Definition of information security goal [67] Data breach or data loss Until a few years ago, data breaches have been the most observed type of incident, see Sect. 2.1. Recently, however, the potential impact from cyber-related BI has become a major concern [4] whose financial consequences could equal or surpass losses from a data breach. Vice versa, cyber incidents have become the most feared BI trigger [12]. Our classification is quite similar to the definition of a cyber attack used in Ref. [13] which, however, only focuses on malicious cyber attacks. In our view, to capture the whole scope of cyber incidents, a second distinction along another dimension, namely the root cause, should be made. 5 Here, the following should be distinguished: • Targeted attacks: Malicious attacks that target one firm specifically due to its characteristics and assets. Usually, the attack vector is tailor-made to circumvent the company's defense strategies. • Individual failures: Non-malicious incidents at single firms that happen due to internal or external machine or system malfunction or human error. • Untargeted attacks: Malicious attacks (from an external source) that do not target one firm specifically because of its characteristics, but are opportunistic in the sense that they attack many available targets-usually simultaneously. • Mass failures: Non-malicious events that affect multiple entities simultaneously, such as the failure of a cloud service provider.
Combining incident types and root causes yields the partition of cyber incidents as shown in Table 2. 6 Note that we use terminology that is common in the natural catastrophe context and is applied in the cyber context in Ref. [70]: an incident refers to a single loss, whereas an event can cause many related incidents. 7 One can further scrutinize motives of individuals or groups for targeting companies via cyber attacks. CRO Forum [69] defines five types of threat actors (with corresponding motivation): nation states (strategic), organised criminals (economic), hackers (reputational), hacktivists (political), and insiders. The last group includes unintentional insiders, emphasizing that, although malicious attacks are more publicly present, a large share of cyber incidents stems from human error or technical 1 3 A comprehensive model for cyber risk based on marked point… problems. This applies to cyber-related BI [4,12], data breaches [5,12], and general cyber incidents [3,55,56]. One peculiar type of targeted attacks not yet explicitly mentioned are so-called supplier attacks, where a company is not attacked directly but through attacks on supply-chain partners (with potentially weaker defenses). Although this so far only accounts for a minority of incidents, studies have identified a trend of attackers slowly shifting their attack patterns to exploit supply chain partner environments, particularly for industries with mature cybersecurity standards [3], and thus many companies will increasingly seek to extend insurance cover to their supply chains [4].

Vulnerabilities and controls
Threats only manifest as successful incidents if there exists an exploitable vulnerability in the target system [36]. We distinguish between symptomatic and systemic vulnerabilities (c.f. Ref. [36,71]), where the former only affect single firms (e.g. via custom software), while the latter can affect many firms simultaneously (e.g. via a vulnerability in standard software). Especially the second kind is worrisome, as it exposes many potential targets to the same threat and thus could lead to highly correlated and simultaneous losses [36]. From the viewpoint of a company, a vulnerability can be mitigated by establishing adequate controls, both technical (e.g. antivirus software) and non-technical (e.g. awareness campaigns [36]).
Investments in cyber security require strategic decisions and cannot be limited to the prevention of cyber incidents, but must also take into account the discovery, investigation, and containment of an attack and the fast recovery of systems to a working state [3]. Many academic works have studied the problem of finding an optimal security level, balancing the cost of controls against the benefits from reduced losses (see Sect. 2.1). We do not further study this problem here, but rather conclude that a firm's IT security level must be a central parameter for an insurance company's risk assessment (as it already is in practice [15]).
Besides opportunistic attacks that stem from the opportunity of exploiting an existing vulnerability, we also consider targeted attacks on a specific victim. Thus, further firm characteristics that incentivise such attacks need to be identified. Considering the list of threat actors in the previous section, the following characteristics arise: • Industry sector: Previous studies indicate that both the number and cost of cyber incidents depend on the industry [3,5,51,52,54,55], with regulated industries such as healthcare and financial services suffering most. Wheatley et al. [54] mention that the industrial sector as a risk factor may serve as a proxy to identify relatively homogeneous subgroups of companies with respect to their frequency of interaction with consumers and the total volume of personal data they guard. • Data: It is intuitive that indeed the amount and sensitivity of data handled by the company is a factor, as especially actors with economic motives will target companies with a high amount of valuable data in order to maximize their economic gain. In practice, this is already incorporated into insurance pricing via hazard weightings [15]. • Company size: Regarding the size of a company, there are different aspects to be considered: Large, publicly known companies are prime targets for threat actors with reputational motives, whereas SMEs are often worse protected due to budget constraints or their smaller awareness for cyber risks.
Eling and Wirfs [56] considered, among others, the company-specific covariates industry sector and size and found both of them to be highly significant for the frequency of all kinds of cyber incidents in their model.

Impact
Parallel to increasing occurrence rates, the economic consequences of various cyber incidents have recently become more severe, with BI and information loss having the highest monetary impact [3]. For data breach incidents, the average cost could be up to several million USD [5,10], where the biggest financial consequence is found to be lost business. Quantifying the consequences of cyber incidents is difficult due to the scarcity of historical data and the various (intangible) types of costs. Nevertheless, earlier studies give some indications of cost drivers. For data breaches, Ponemon Institute LLC [5] find the average cost per record to depend on the root cause (malicious attacks vs. system failures and human error) and the industry sector. The latter could be explained by the fact that the rate of lost customers and business depends on the industry, but also by considering the impact of regulation and litigation on breach cost causing highly regulated industries to suffer larger losses [12]. An effect of the company size on the breach cost was reported in Refs. [53] and [55], who developed a model for the log-cost of a data breach depending on the firm's revenue (as a proxy for size) and the number of compromised records. This is more comprehensive than the well-known Jacob's formula [72], which simply links the log-cost of a data breach to the (log-)number of compromised records. Another amendment was proposed in Ref. [52], who argue that [72] did not yet take into account the cost of mega data breaches observed in future years. Finally, adequate controls can not only decrease the probability of a breach, but also its potential consequences: Improvements in data governance programs, presence of incident response plans, and employee training all result in average cost savings in case of a breach [5,60].
Concluding, there is evidence that for data breaches the cost of an incident depends on the industry sector, the size of the company, the amount of data affected, potentially the type of attack, and controls in place. The statistical findings and distributions used to model the severity of data breaches found in investigations of available databases have been summarized in Sect. 2.1. Note that these findings for data breaches might not necessarily translate to the other incident types, as different types of cyber incidents (e.g. data breaches and privacy violations) are found to display large median cost differences [55].
It is hard to find information on the economic impact of the other two types of incidents studied here, namely BI and fraud. For the former, some sources from the non-cyber domain are available [73][74][75][76][77]. The only sources including indications of which distributions are useful to model economic loss from BI are [77], who finds that the size of yearly BI insurance claims follows a Pareto distribution with an extremely heavy tail and infinite expected claim size, and [75], who suggests modelling BI loss by a PERT distribution.
For fraud one might hope to find information in studies on the cost of cyber crime but, unfortunately, the data is usually either aggregated over all types of cyber incidents from malicious sources or focuses on information loss/theft. Thus, there is very little reliable evidence on the actual cost of cyber fraud. Despite indications that cyber risk is quite different from other types of operational risk [56], one way might be to draw on knowledge about the modelling of operational risk as, e.g. the Basel II framework [78] includes internal fraud and external fraud as event-type categories.
Another option is to refer to the recent work of [56], who study all kinds of cyber incidents (including data breaches as a subset) using a model where the parameters of the distribution of both frequency and severity of cyber incidents might depend on firm-specific and incident-specific covariates as well as time. They resort to an EVT approach to model the severity of events, using the generalized Pareto distribution (GPD) to model excesses over a high threshold (the tail of the distribution) and a series of simple parametric distributions (e.g. exponential, Gamma, log-normal) for the body. The GPD with shape parameter and scale parameter is of the form for x ≥ 0 if ≥ 0 and x ∈ [0, − ∕ ] if < 0 . They build on the work of [57] to fit a model where the parameters of the GPD may depend on covariates (including time). To the best of our knowledge, their work is the first to model the actual economic loss and to consider general types of cyber incidents instead of only data breaches, thus we incorporate their approach in our framework.

Properties of a cyber risk model
Before proposing a model for cyber risk, we shortly summarize the properties/stylized facts it should possess given the findings from this chapter: A comprehensive model for cyber risk based on marked point… teristics are the industry sector, the company size, the data handled by the company, and its IT security level. • The model should be able to capture the dynamic nature of cyber risk, as occurrence rates as well as impact of cyber incidents may change over time.

Actuarial model
Considering cyber risk a combination of threat, vulnerability, and impact, from an actuarial viewpoint it remains to translate this understanding into modelling frequency and severity of losses within a portfolio. After doing so in Sects. 4.2 and 4.3, in Sect. 4.4 we address actuarial questions, before illustrating the model in a simulation study in Sect. 5. Note that the proposed model is purposefully constructed in a modular way, as some of the assumptions and parameter choices might be updated in the future once suitable data is available. Moreover, a user who wants to incorporate properties of an internal data set can refine individual features of the model (or replace parts) without changing the overall structure.

Insurance portfolio
Consider K firms, labeled {1, … , K} , constituting the portfolio of an insurance company exposed to losses due to cyber incidents (idiosyncratic or caused by systemic events). This typically refers to losses covered by stand-alone cyber policies, but might in some cases include losses that still fall under traditional policies for some insurers (note that a trend towards the elimination of cyber exposure in traditional business is observed, hopefully leading to a clear-cut distinction in the future). Following the findings of Sect. 3, we assume that for each company included into the insurer's portfolio, information about relevant covariates is collected via a questionnaire and public information. Table 3 gives an overview of the characteristics we identified as relevant, the potential to elicit the necessary information from public data or a firm's voluntary disclosure, and a suggestion for their inclusion in a mathematical model. 8 Thus, for each firm j ∈ {1, … , K} , the vector of covariates is known, yielding a K × 5 covariate matrix 8 We do not claim this list to be exhaustive but stress that all required information can be objectively collected by an insurer. For example, one could argue that for an insurer with a world-wide portfolio, information on a firm's location (jurisdiction) should be added as it might influence the severity (e.g. via fines to be paid following a data breach) as well as, for targeted attacks, the frequency (as data from some countries might be more valuable and therefore a more frequent target) of cyber losses.
where each row corresponds to one firm in the portfolio and each column to one of the above covariates in the order given in Eq. (1). We assume for notational convenience that all firms are ordered by sector, i.e. suppose there are B sectors and K̂b,b ∈ {1, … , B}, such that firms in sector b exactly correspond to indices

Loss frequency
We will use the framework of point processes to model the arrival of cyber incidents. This allows to naturally incorporate time-and covariate-dependence of the incident frequency and to distinguish between different types of incidents. A comprehensive overview on point processes is given in Refs. [79,80], whose notation we use. In the following, all random variables are defined on a suitable probability space (Ω, F, ℙ) , where Ω is the state space, F a -algebra on Ω , and ℙ a probability measure on (Ω, F) . For our purposes, we focus on simple point processes on the non-negative real line, i.e. processes on the state space [0, ∞) interpreted as time, whose corresponding counting process (N(t)) t≥0 = |{i ∈ ℕ ∶ t i ∈ [0, t]}| t≥0 has unit increments, where | ⋅ | denotes the cardinality, i.e. the number of elements, of a set.
We recall Table 2 for a classification of cyber incidents according to their incident type and root cause: Idiosyncratic incidents (targeted attacks and individual failures) are discussed in the next section, systemic events (untargeted attacks and mass failures) are addressed subsequently.
9 Ties can be ordered arbitrarily.

Idiosyncratic incidents
Idiosyncratic incidents are assumed to occur at each firm independently (from incidents at other firms as well as between types of incidents at the same firm). For these types of incidents, we assume that any incident is successful in the sense that it breaches the firms IT security measures and causes a loss. This is reasonable, given that targeted attacks are usually tailor-made against one company and furthermore, the majority of non-successful incidents (near misses) of this type might not be monitored or recognized. 10 We assume the arrival of such incidents of each type {DB, BI, FR} at each firm j ∈ {1, … , K} to follow an inhomogeneous Poisson process with time-and covariate-dependent rate where the super-/subscript ⋅ stands for one of the incident types ⋅ ∈ {DB, FR, BI} , the functions f ⋅ additively map (a relevant subset of) the covariates, i.e. f ⋅ (x) = ,⋅ + ∑ k f ,⋅,k (x jk ) for some constant ,⋅ and g ⋅ ∶ [0, T] → ℝ is a measurable function describing the time dependence. The explicit form of the functions f ⋅ and g ⋅ is of course unknown but can be estimated from a suitable data set. 11 The dependence on covariates and time can differ for the three incident types. As an example, if one assumes the rate of data breaches to depend on the covariates x j3 (data; for targeted attacks), x j4 (IT security; for failures), and x j5 (number of suppliers; for supplier attacks) only, this would yield where the functions f ,DB,k map factor levels to constants for the ordinal covariates indexed k ∈ {3, 5} (i.e. are naturally measurable) and f ,DB,4 is any measurable function of the numerical covariate x j4 .
It is clear that for any interval [ 1 , 2 ] ⊆ [0, ∞) (set for now 1 ∶= 0 and 2 =∶ T ), given the covariate matrix , the number of idiosyncratic incidents of each type arriving at firm j follows a Poisson distribution: 4,5 f ,DB,k (x jk ) + g DB,idio (t), 10 As we assume these attacks to occur due to the firm's characteristics, one might ask if a firm has to simply take its exposure to these types of threats as given. For the occurrence rate of malicious targeted attacks this might be true, but we assume that the impact of a successful attack can be limited by adequate measures (see Sect. 4.3). Furthermore, putting security measures in place mitigates the occurrence of individual failures and potentially implicitly deters targeted attacks as attackers would have to invest more resources to devise an attack vector. 11 As this ansatz constitutes a standard generalized additive model, techniques for parameter estimation are readily available, see, e.g. Ref. [81]. Using the statistical software R, such models can be fit with the function gam(...,family=poisson) from the package mgcv.
where Λ ⋅,idio (t)dt is the mean measure of the inhomogeneous Poisson process.
As the processes are assumed independent between firms, it follows by superposition that the number of idiosyncratic incidents on the whole portfolio is also Poisson distributed:

Systemic events
Systemic events cause incidents at multiple firms at the same time and, if of malicious origin, are typically of an opportunistic nature, i.e. a set of firms is affected not because of their specific features or the economic gain attainable from attacking them, but rather due to the availability of an exploitable attack vector against them. This often stems from a common vulnerability, for example a list of bought e-mail-addresses that allows a threat actor to send ransomware to employees of certain firms. In many cases, a common vulnerability would likely affect firms within one industry sector (e.g. if custom software is vulnerable), but of course the common factor can also be unobservable. In any case, to model incidents from systemic events, an extension of the simple point process framework of the previous section is needed. We use the framework of marked point processes, where the process of locations (arrival timepoints of events), now called the ground process N g (⋅) , is a simple 12 point process {t i } i∈ℕ on the non-negative real line as above, more specifically a non-homogeneous Poisson process with log-rate 13 where the super-/subscript ⋅ indicates the event type ⋅ ∈ {DB, FR, BI} , and again g ⋅ is a measurable function of time. Each arrival of the ground process {t i } i∈ℕ is then (4) log ⋅,g (t) = g ⋅,g (t),

Remark 1 (Interpretation of mark components)
1. m i ∈ M = [0, 1] describes the strength of an event, where strength can be understood e.g. as effectiveness to overcome IT security measures. 14 This is useful to include, as in reality a wide range of sophistication of attacks exists and capturing their strength allows to quantify the effectiveness of IT security measures and the sensitivity of the expected loss to their improvement. 2. S i ∈ S = P K encodes the subset of the portfolio affected by an event.
The two components of the mark are used jointly to determine which firms suffer a loss from a given event, namely those firms included in the affected subset whose security level is lower than the strength of the event. With the above notation, an event t i , (m i , S i ) � -arrives at time t i , -reaches exactly the firms j ∈ S i , and -causes a loss in exactly the firms j ∈ S To characterize a marked point process completely, it remains to specify the conditional distribution of the marks, given the locations of the Poisson ground process N g (see Proposition 6 in Online Appendix A.1). This is done in the following assumptions whose rationality will be detailed below: A concrete distributional assumption for a model should ideally be backed by empirical evidence. As this is currently not possible due to data scarcity in the cyber domain, we stick to the principle of imposing as little (unknown) prior information as possible. This justifies (A1) and (A2), as we do not have any evidence that would suggest deviating from iid., to introduce any particular dependence, neither between locations and marks, nor between the components of the mark. Similarly, regarding (A3), one might intuitively rather assume the number of very weak attacks (such as easily recognizable spam e-mails) to be higher than the number of very sophisticated attacks. However, as we do not have statistical evidence that would allow to choose a particular distribution, we use a Uniform distribution (maximum entropy distribution among all continuous distributions on a bounded interval [82]). Considering (A4), several industry experts have highlighted in conversations with us the importance of industry sector-specific systemic events. Thus, we incorporate this idea in our model, while again leaving the distribution as simplistic as possible (conditionally iid. Bernoulli draws). Furthermore, note that due to the modular structure of the model, each assumption can be altered or replaced individually if suitable data indicates the necessity, without compromising the general model structure.

Properties of the model
In the following, we detail properties of the model and their interpretation in the cyber insurance context. As proofs mostly rely on standard techniques, they are given in Online Appendix A.3.  Notice that the distribution of {|S * i |} i∈ℕ implies the distribution of {|S i |} i∈ℕ as the special case where c [k] = 0, ∀k ∈ {1, … , K} , i.e. the worst-case scenario where no firm has any IT security measures in place and thus the number of incidents and losses is equivalent. Proposition 2 (Conditional incident and loss probability) For a firm j 1 ∈ {1, … , K} in sector b j 1 , the probability of being affected by an event, given the information that another firm j 2 ∈ {1, … , K} in sector b j 2 has been affected (i.e. the conditional incident probability), is given by

Proposition 1 (Distribution of number of incidents and losses) Under (A4), the number of incidents per event {|S
where is the (unconditional) incident probability for each firm, given its industry sector.
Likewise, for the conditional loss probabilities, where the unconditional loss probability is given by The above results are interesting from a practical viewpoint: If an insurer is notified about a cyber incident by one of its policyholders (many policies include mandatory immediate notification or even the provision of an immediate-response-team by the insurer), it is worthwhile to find (and warn!) firms with a high conditional probability of having been affected by the same event, thus potentially giving them the chance to avert the actual manifestation in their firm (e.g. by warning employees about a phishing threat or updating vulnerable software). The information about an incident in one firm always has a non-negative effect on the incident probabilities for other firms of the same sector ((7a) vs. (8); a formal proof of this statement is given in Online Appendix A.3), while the effect can go in either direction for firms of different sectors ((7b) vs. (8)). For a detailed illustration, see Fig. 1. The same holds for the information of a suffered loss, i.e. the probability of suffering a loss increases with the knowledge that another firm of the same sector has suffered a loss, and the increase is larger if the harmed firm's IT security level exceeds the one of the firm under consideration.
Analogously to the notation in the previous section, for any interval [0, T] ⊆ [0, ∞) , given the arrival process t i , (m i , S i ) � i∈ℕ and the covariate matrix , the number of incidents N ⋅,syst j resp. losses N ⋅,syst j at each firm follows a Poisson process, where the rate can be obtained by thinning the ground process N ⋅,g of arrivals {t i } i∈ℕ appropriately (see Ref. [83] and Proposition 5 in Online Appendix A.1). In particular Contrary to the previous section, we cannot transition to the portfolio level by simple superposition due to lack of independence between firms. Instead, we express the cumulative number of incidents N ⋅,syst (T) resp. losses N ⋅,syst (T) across the entire portfolio for fixed T > 0 as a compound Poisson distributed r.v.
where N ⋅,g (T) ∼ Poi ∫ T 0 ⋅,g (t)dt and {|S i |} i∈ℕ resp. {|S * i |} i∈ℕ are iid. mixed Binomial and independent from N ⋅,g (T). 15 Using well-known results for the calculation of the expectation and variance of a compound Poisson r.v. (details in Refs. [84,85] and Online Appendix A.2), this yields:

Proposition 3 (Overdispersion of systemic incident/loss numbers) Assume K > 1
and K̂b > 1 for at least one b ∈ {1, … , B} with p̂b > 0. Then, the cumulative number   Fig. 1 We illustrate the effect of the information that a firm of the same (resp. a different) sector has been affected by an event on the incident probability for just two sectors. (a) shows the (un) conditional probabilities for a firm from sector 1 dependent on p 1 . The other parameters are chosen as p G = p sec = p gen = 0.5 (thin lines) or p G = 0.5, p sec = 0.2, p gen = 0.8 (thick lines), such that one observes that for b j 1 ≠ b j 2 , the conditional probabilities can be above or below the unconditional one (solid line), whereas for b j 1 = b j 2 , the conditioning has a non-negative effect in both cases. Likewise in (b), for p G = 0.5, p gen = 0.5, p 1 = 1 − p 2 = 0.75 , probabilities for all cases are shown dependent on p sec . Observe again that conditioning on the same sector has a non-negative effect, whereas when conditioning on the other sector, there is an intersection ℙ(j 1 ∈ S i ) = ℙ(j 1 ∈ S i | j 2 ∈ S i ) (and ℙ(j 2 ∈ S i ) = ℙ(j 2 ∈ S i | j 1 ∈ S i ) ) at p sec = 0.4305

of incidents resp. losses from systemic events is overdispersed, i.e. has a dispersion index (DI = variance-to-mean ratio) exceeding 1.
It is well-known that the Poisson distribution exhibits equidispersion ( DI = 1 ), while empirical studies in non-life insurance often report overdispersed claim count data and therefore recommend to use alternative distributions (e.g. negative Binomial) to model claim counts. Proposition 3 shows that our construction using a marked Poisson process to allow simultaneous arrivals of several incidents (resp. losses) due to one systemic event likewise is able to introduce overdispersion.

Summary: loss frequency model
Recall that idiosyncratic incidents arrive to each firm j ∈ {1, … , K} independently as inhomogeneous Poisson processes with rates ⋅,idio j (t) , ⋅ ∈ {DB, FR, BI} . Each incident is assumed to cause a loss. For a fixed time T > 0 , the overall number of losses caused by idiosyncratic incidents up to this time N ⋅,idio (T) follows a Poisson distribution with rate Λ ⋅,idio (T) = ∫ T 0 dt.
Systemic events arrive to the portfolio with overall rates ⋅,g (t) . Each arrival t i carries a mark including the strength of the event m i and the affected subset S i . An event at time t i thus causes incidents in all firms in the set {j ∈ S i } and causes losses in its subset {j ∈ S * i } = j ∈ S i , c j < m i . The total number of incidents and losses from systemic events up to time T > 0 , N ⋅,syst (T) resp. N ⋅,syst (T) , follow a compound Poisson distribution with mixed Binomial jump sizes.
Aggregating the number of incidents and losses from both root causes on the level of each individual firm translates to aggregating two independent Poisson r.v.
On the portfolio level, we aggregate two independent compound Poisson r.v. (one with jumps of constant size 1 and one with mixed Binomial jump sizes), which yields (see Proposition 7 in Online Appendix A.2): where N(T) ∼ Poi Λ ⋅,idio (T) + Λ ⋅,g (T) and {Y i } i∈ℕ are iid., independent of N(T), with mixture distribution Note that so far, we have kept the numbers of losses of different types {DB, FR, BI} separate. In general, assuming independence between them, they could be aggregated into one arrival process, but this may not be desirable as also the loss severity distributions might be different (see Sect. 3.2.3), and therefore for the determination of the portfolio loss their numbers have to be taken into account separately.

Loss severity
After describing the model of the cumulative number of cyber incidents and losses in the last section, we now turn to their impact, i.e. let L ij ∶= L j (t i ) be a r.v. describing the non-negative monetary loss caused by a cyber incident reaching firm j ∈ {1, … , K} at time t i ∈ [0, T] . Based on previous findings from academic literature and the arguments in Sect. 3.2.3, we model the body and tail of the loss severity distribution separately and allow the parameters of the distributions to exhibit timeand covariate-dependence. Specifically, for all types of incidents, we suggest using a combination of log-normal and generalized Pareto distribution based on the findings of Ref. [56]. Other promising approaches (e.g. based on Refs. [49] and [72] for DBs or based on Ref. [75] for BIs) are detailed in Online Appendix A.4.
As we do not rely on empirical data, we first need to set a threshold between body and tail of the to-be-constructed distribution. Therefore, we first assume an underlying log-normal distribution L ⋅ ij ∼ LN( ⋅ ij , ⋅ ) and select a high quantile as threshold, e.g. set u ⋅ ij = q z (L ⋅ ij ) with e.g. z = 0.95 . Given the threshold, construct the density f L ⋅ ij of the loss distribution as 16 where TruncLN( , , x min , x max ) denotes a truncated log-normal distribution on the interval [x min , x max ] and GPD(u, , ) denotes a generalized Pareto distribution with (13) location u, shape , and scale . 17 As for the idiosyncratic frequency modelling illustrated in Sect. 4.2.1, the sum might run over different subsets of covariates for different incident types (see e.g. Eq. (3)). Next, we combine the concepts for frequency and severity modelling to study some questions that arise from an actuarial viewpoint.

Insurance pricing and risk measurement
Recall that we take the perspective of an insurer, whose portfolio consists of K firms exposed to cyber losses whose frequency and severity are modelled as detailed in Sects. 4.2 and 4.3. Questions of interest for the insurer when setting up a portfolio of cyber insurance policies typically include: 1. Contract design (deductibles, cover limits, coverage period); 2. Pricing of individual policies given an applicant's characteristics; 3. Estimation and quantification of the portfolio risk.
In this work, we do not elaborate in detail on the first question and for now assume no deductible, no cover limit, and a standard policy duration of one year. These assumptions imply that for each incident, the loss suffered by the insured firm and the claim size faced by the insurer are equal and the terms will be used interchangeably. To study the latter two questions, the total claim amount process is denoted where it is assumed that the claim number process (N(t)) t≥0 is independent of the iid., a.s. positive, claim size sequence {Y i } i∈ℕ . We restrict our focus to the case of fixed T > 0 , i.e. instead of studying the process (L(t)) t≥0 , study the random variable L(T). In general, it is very hard to make statements about the exact distribution of L and one has to resort to Monte Carlo methods or, if applicable, a numerical routine like the Panjer recursion.
In our context, the loss for a firm j ∈ {1, … , K} up to time T > 0 from one type of cyber incidents (e.g. data breaches) can be expressed as 17 Note that when fitting a GPD with covariate-dependent parameters using the method developed in Ref. [57], an orthogonal reparametrization ( (x j , t), (x j , t)) ∶= ( (x j , t), log( (x j , t)(1 + (x j , t)))) is chosen. The resulting MLE ̂ can be transformed back directly to an estimator ̂ , but the dependence of on the covariates does then not follow a GAM structure anymore. Therefore, a more general functional relationship is stated above.

3
A comprehensive model for cyber risk based on marked point… where N ⋅ j (T) ∼ Poi Λ ⋅ j (T) as given in Eq. (11) and {L ⋅,(j) as given in Eq. (13). Note that in general the sequence {L ⋅,(j) i } i∈ℕ is not iid. (due to time-dependence). However, if we drop time-dependence, which in practice could mean assuming constant severity distributions on one-year intervals considered separately, L ⋅ j (T) is again compound Poisson, and the total cyber loss incurred by firm j ∈ {1, … , K} is given by The portfolio loss is simply given by the sum of (dependent) firm losses, i.e.
Regarding the second question of finding a premium Π(T) for an individual insurance policy on [0, T], we recall the well-known premium calculation principles listed below [86]. As it is often impossible to find the exact distributional properties of the total claim amount process, the ones based on the first two moments are popular in practice. Concerning the question of quantifying the risk of the overall portfolio loss, the two most common tail risk measures are the Value-at-Risk (VaR) at a given confidence level 1 − and, if applicable, the corresponding Average Value-at-Risk (AVaR). Theoretically, for a positive loss r.v. L with cdf F L , they are given by where F −1 L denotes the generalized inverse of F L and ( * ) requires F L to be continuous. Note that in cases with very heavy-tailed loss severities (as e.g. observed in some of the previous works on cyber risk), AVaR(L) cannot be computed as it relies on L to have finite expectation. .

3 5 An example of an actuarial application via a simulation study
The aim of the following section is to illustrate the application of the proposed modelling approach to pricing and risk measurement in an actuarial context. To this end, a fictitious insurance portfolio is constructed and parameters for the frequency and severity distributions as given in the previous sections are proposed based on previous academic literature and expert judgement. Based on the resulting simulated portfolio loss distribution, the effect of interdependent losses and the introduction of cover limits is highlighted. Due to the scarcity of available empirical data, the parameters and model assumptions could not yet be fit to (resp. challenged on) a real dataset; this remains an important task for future research.

Portfolio composition and company covariates
We first construct a (fictitious) insurance portfolio consisting of K = 50 firms from B = 6 sectors, all details are listed in Table 3. A bigger portfolio, which is used in our simulation study, is then obtained by copying each firm 10 times with IT security levels varying from 0.05 to 0.95 (stepsize 0.1). This enables us to compare the results of the entire portfolio ( K = 500 ) with sub-portfolios ( K = 50 ) of different security level (denoted sub-portfolio 1 -10), and for each individual firm with varying security level. Table 4 gives an overview of the relative and absolute frequencies for each covariate in each sub-portfolio.

Frequency distribution
We require our simulation to adhere to the following stylized facts (F1)-(F5) for the frequency of idiosyncratic incidents: (F1) Consider a T = 5-year observation period, during which the frequency increases by around 67% [3]. The increase is realized in yearly (log-linear) steps; within each year the frequency is assumed constant. (F2) During the first year ( t ∈ [0, 1) ) and for baseline covariate levels s j = d j = nsup j = 1, c j = 0.5 , the incident (loss) probability is 0.01 (this is a conservative estimate).
(F4) An increase of either of the categorical covariates s j , d j , and nsup j by one (two) level(s) from the baseline implies an increase of the incident rate by 10% (20%). (F5) Assume a log-linear influence of the security level such that increasing it to the maximum ( c j = 1 ) yields a halved rate of cyber incidents (and thus lowering it to the minimum ( c j = 0 ) leads to a doubled rate), compared to the baseline.
These assumptions imply the parameters for the covariate-dependent rates of idiosyncratic incidents (c.f. Eq. (2)) in the upper panel of Table 5. 18 For systemic events, we follow Sect. 4.2.2, where the assumed parameters for the ground process (c.f. Eq. (4)) and the mark distribution are given in the lower panel of Table 5. Although it is difficult to make assumptions, as none of the existing studies explicitly distinguish systemic events, Table 5 reflects the following simplifying assumptions (F6), (F7) (as before, we do not have any information to justify any more complex assumptions): (F6) The mark distribution is equal for DB, BI, and FR. Sector-specific events are (discretely) uniformly distributed over all sectors. (F7) The number of incidents from systemic events is similar to the number of idiosyncratic incidents for baseline covariate levels, which implies a doubled overall incident frequency (and a 50% increased loss frequency).

Severity distribution
For this study, we deviate from the very high mean (resp. median) severity estimates given in the existing literature (several million US$ for a single incident) for two reasons: First, it is reasonable that events listed in public databases exhibit much higher losses than the average daily-life cyber incident that goes unnoticed by the public and second, insurance policies currently offered on the market (especially policies for SMEs) usually have cover limits of up to 5 million US$ , therefore it would not be reasonable to assume mean claim severities that already exhaust the policy limit. 19 Recall that this study is intended as a prototype to show the general behaviour of the model; absolute numbers given should not be interpreted as representative of a real-world portfolio. The following assumptions (S1)-(S7) lead to the choice of parameters given in Table 6: (S1) During the first year and for baseline covariate levels, for all types of incidents the expected claim size of the underlying log-normal distribution is given by The standard deviation of the underlying log-normal cost distribution is constant and consistent with the results for (negligent) data breaches in Refs. [49] and [72]. 1 [56] to model heavy-tailed behaviour while avoiding a switch from a finitemean to an infinite-mean scenario. (S6) The expected threshold exceedance (relative to the corresponding threshold, dependent on the underlying log-normal distribution) ) −1 equals 0.5 for baseline covariate levels, i.e. the expected size of a claim exceeding the threshold is given by 1.5 times the threshold. (S7) The same assumptions regarding covariate-and time-dependence as for small claims are made, referring to the expected relative threshold exceedance (e.g. a one-level increase of s j leads to a 10% increase) instead of the expected claim size. In this case, the influence of c j is linear and such that c j = 1 results in a halved expected relative threshold exceedance. 20

Results of the simulation study
The following results are based on 50.000 simulation runs on a grid of 5 years, reported values refer to the first year unless stated otherwise. For each run, the arrival times of idiosyncratic incidents (at each firm) and systemic events are generated using the rates in Eqs. (2) and (4) ) with coefficients given in Table 6 which do not strictly fit into the framework of [57] for fitting a covariate-dependent GPD. When calibrating the model to data, it is not required to make any such assumption. Note, however, that due to the reparametrization in the framework of [57], the covariate dependence of is not intuitive. Therefore, we stick to intuitively interpretable assumptions.

3
A comprehensive model for cyber risk based on marked point… Table 3 List and modelling framework of company covariates to be included in a cyber risk model Covariate  [52,55,72]. Usually considered for insurance pricing [15]. Risk classification should take into account number of records and type of data (e.g. PII, PHI, credit card data is more sensitive), see Table 12 in Online Appendix A.7.
Self-report via questionnaire or e.g. scrutiny by a service provider hired by the insurer Relevance is clear, but e.g. emphasized in Refs. [5,60]. Usually taken into account for insurance pricing [15]. For details, see Table 13 in Online Appendix A.7. Relevant for exposure to supplier attacks.

Number of suppliers
yields an overall number of losses and incidents for each firm on each step of the grid, such that the corresponding severities can be drawn from the appropriate (time-and covariate-dependent) distribution.

Cumulative loss distribution
First, we examine the number of incidents/losses and the distribution of cumulative losses in the full portfolio ( K = 500 ) in Fig. 2a and b. We compare the case where only actual losses are counted with the case where all incidents are counted (i.e. the worst case where all incidents cause a loss). At first glance, the two cases appear to be surprisingly similar. Notice, however, that due to the assumptions above, for most firms the rate of idiosyncratic incidents outweighs the rate of incidents from systemic events. Furthermore, the lower the security level for a given firm, the higher its contribution to the overall number of incidents, and simultaneously the lower the effect of distinguishing losses and incidents. Conversely, the higher the security level of a given firm, the less likely it is to be affected at all. Therefore, when only few cases are registered at all, these cases are likely to have occurred at firms with low security and are therefore unlikely to be filtered. The cases where most filtering occurs are large systemic events whose effect is clearly reduced (consider the range around [45,65] on the x-axis of Fig. 2b). Of course, this translates analogously to Fig. 2a, where particularly the tail of the distribution is altered (x-axis-range around [2500, 4000] in Fig. 2a). In this case, it additionally has to be kept in mind that incidents at well-protected firms-which are mostly filtered-are assumed to typically cause below-average losses. In both figures, one observes the difference in mean between counting losses and incidents, and that the mean is shifted clearly to the right from the mode of the body of the distribution. As expected from the assumptions above, we observe a shift of the cumulative loss distribution to the right as time progresses. To corroborate the simulation results, we generate 50.000 samples of incident/loss numbers following Proposition 1 and Eq. (12) and compare them in Fig. 2d. The simulation via Eq. (12) is much faster, but cannot be directly used to generate the cumulative loss distribution, as only samples of the total number of incidents/losses are drawn without information as to which firms they affect (and severity differs between firms). Furthermore, we compare the cumulative loss distribution for selected sub-portfolios in Fig. 2c, taking into account only simulation runs where a non-zero loss has been observed. As to be expected, we observe a shift of the body and tail of the loss distribution to the left as the security level increases. Understanding the cumulative loss distribution-especially in the tail-is particularly interesting in the context of reinsurance, where common contract design involves so-called excess-of-loss reinsurance, meaning that (portfolio) losses exceeding a pre-specified limit are ceded. For this case, an accurate understanding of the portfolio loss distribution and its tail is clearly essential. Apart from these considerations, insurers are mostly concerned with the pricing of individual policies. This is addressed next.

Premium calculation
Based on the distribution of individual losses, we calculate the first-year premium based on the expected value principle given in Sect. 4.4. 21 Table 7 compares the following exemplary firms: Firm 1: A small manufacturing business with low data and supplier risk and low IT security standards ( c = 0.15). Firm 2: A medium-sized company in the financial sector with medium data and supplier risk and high IT security standards ( c = 0.85). Firm 3: A large health care provider with high data risk, medium supplier risk, and average IT security standards ( c = 0.55).
The results show that the IT security level dominates the covariate-effect on the premium, which is in line with the assumptions. As in practice, the calculation of expected losses is typically based on historically recorded (rare!) losses, only very few firms with the exact same covariate combinations might be in the portfolio and therefore, the premium is rather calculated based on all losses within a class of firms considered homogeneous. New firms falling into the same class are then assigned the same premium. The quite difficult task is to find an appropriate way of partitioning firms into homogeneous groups. If we partition firms according to their IT security level and calculate their premium by taking into account all firms with the same level, we obtain the results shown in Fig. 3. 22 As to be expected, the premium  Footnote 22 (continued) the loss history of the-assumed homogeneous-portfolio. Combining the two "extremes" of considering only individual loss experience and only loss experience from a homogeneous group lies at the heart of credibility theory approaches and will not be addressed here.
decreases with increasing security level, while the difference between incidents and losses increases. As an alternative to (and validation for) the Monte Carlo simulation, we have furthermore implemented a Panjer recursion scheme using a discretized version of the severity distribution; the results are given in Online Appendix A.5 and corroborate the ones given here.

Risk measurement on individual and portfolio level
We compare VaR and AVaR for the three firms described above and two sub-portfolios in Table 8, as well as for all sub-portfolios in Fig. 4a and b. The historical estimate refers to the sample quantile from the simulation data, i.e. for a realisation of losses = (L 1 , … , L n ) , let L (1) < L (2) < … < L (n) denote the order statistics, then for a chosen level (1 − ) ∈ i−1 n , i n , VaR 1− and AVaR 1− are estimated as their empirical counterparts The POT estimate assumes that for a large threshold u, the excesses are distributed according to a generalized Pareto distribution GPD(u, , ) , and thus VaR 1− and AVaR 1− can be estimated as (see, e.g. Ref. [57]) where ̂ and ̂ are the parameter estimates of the scale and shape of the GPD given the data and n ′ is the number of threshold exceedances. As to be expected, both VaR and AVaR decrease with increasing security level, while the reduction when considering only losses instead of all incidents is more substantial. Note again that 3 We compare the premium (with loading 0.2 as above) that would be assigned to firms if they were grouped according to their IT security level. We observe that simulated values are now very close to theoretical ones, as they depend on the loss history of a sub-portfolio of 50 firms, such that Monte Carlo noise is reduced (compared to Table 7). We furthermore compare the values for the single firms from Table 7 with the portfolio they would be grouped into, and observe that e.g. firm 1, when evaluated on its own, is slightly less risky than the average firm in sub-portfolio 2 for individual firms, the numbers are based on their own loss history only and should be interpreted with care.

How relevant is accumulation risk?
We have repeatedly stressed the distinction between idiosyncratic incidents and systemic events and emphasized that the latter can lead to accumulation risk (re-) insurers should be particularly worried about. One might now question whether the effect of including systemic events on the loss distribution warrants such a more complicated model. In order to answer this question (spoiler alert: yes!), we compare the results above with the results of a model that assumes the same marginal frequency as before for each firm, but assumes all incidents to be idiosyncratic, i.e. to occur independently from other firms. Intuitively, this should lead to the same premium for each individual contract, but decrease portfolio risk. Following Eq. (11), the overall number of incidents N ⋅ j (T) at each firm j ∈ {1, … , K} is generated using two independent Poisson r.v.
independently from all other firms. To be able to compare the two cases in each run, the number of losses N ⋅ j (T) at each firm j ∈ {1, … , K} is then generated as With the severity distributions remaining unchanged, an analogous simulation study as above is conducted. Again, we first examine the overall distribution of the cumulative portfolio loss and number of incidents and losses in Fig. 5a and b, respectively. The difference to Fig. 2a and b is immediately evident: -The visible heavy tails for both incident numbers and cumulative losses have vanished; thus it can be assumed they have been caused by systemic events with many firms affected simultaneously. -In particular, the highest observed number of losses has decreased to around 17% of its previous value in both considered years, while mean losses and mean numbers of incidents/losses have stayed unaffected. -The difference between incidents and losses is more directly visible, as in the independence case the body of the cumulative loss distribution is directly affected. This is because individual incidents are now filtered instead of the filtering impacting only systemic events, whose occurrence mostly alters the tail of the distribution.
From these findings, we conclude that incorporating systemic events into the model to capture potential accumulation risk is essential. We furthermore report VaR 0.99 and AVaR 0.99 for all sub-portfolios in Fig. 6a and b, respectively. Comparing them with Fig. 4a and b yields the same to-be-expected decreasing pattern as the security level increases, but the absolute values of the risk measures can be observed to have about halved. Perhaps it should rather be put vice versa: By including systemic events compared to complete independence, for the same expected overall . number of incidents, the risk measures VaR 0.99 and AVaR 0.99 on sub-portfolio level double.
As the marginal frequency and severity for each firm remain unchanged, calculated premiums should not differ from the previous simulation study; this is corroborated in Online Appendix A.5.
We have mentioned before that very heavy-tailed loss severities characterize cyber risk, and previous studies typically suggest such heavy tails that any moments higher than first order do not exist (and in some cases tail parameter estimates even yield infinite-mean scenarios). Even a finite-mean, infinite-variance scenario (as above, with 0.5 < < 1 ) is cumbersome to deal with, as e.g. only premium calculations based on the first moment can be applied. "Luckily," in the insurance context, one typically does not deal with loss severities (without upper limit) directly, but rather with claim sizes, which are typically bounded from above by the introduction of a cover limit, a maximum amount the insurer is obliged to cover for each loss. The effects of this contract design feature are examined next.

Cyber policy design: the effect of cover limits
In practice, typical (primary) insurance contracts include a cover limit, as the insurer seeks to bound losses from single, extreme incidents. This, however, can lead to a supply-demand-mismatch: Insurers, still cautious of this new risk type, prefer relatively low cover limits (with a few exceptions, see the overview in Ref. [25]) that are sufficient to cover day-to-day cyber incidents, while many firms particularly seek protection for extreme scenarios such as a large data breach or long BI. [14,58,60] reported the non-existence of adequate cover limits as one reason for firms to refrain from purchasing cyber insurance.
Mathematically speaking, the introduction of a cover limit M corresponds to the truncation of the loss distribution, i.e. each Y i ∈ [0, ∞) is mapped to a claim size Ŷ i via Y i ↦Ŷ i ∶= min{Y i ,M} ∈ [0,M] . Note that we assume a limit on each loss; alternatives might be a limit on the total loss over the policy duration or a limit on the number of covered claims. Assuming, however, a realistically small claim frequency, this does not make a large difference, as cases of multiple losses happening at the same firm during a single policy year are extremely unlikely. Table 9 reports the probabilities of exceeding different cover limits for a large severity event and three different covariate combinations: the baseline case (year 1, s = d = nsup = 1 , c = 0.5 ), the lowest-risk case in the portfolio (year 1, s = d = nsup = 1 , c = 0.95 ), and the highest-risk case in the portfolio (year 5, s = d = nsup = 3 , c = 0.05 ). To find the probability of an incoming claim to exceed the cover limit, we condition on observing a large claim event, i.e. in the notation of Sect. 4.3: Table 9 ℙ(L ij > u ij ) ��������� We now assume the cover limit for all contracts to be M 2 and run the same simulation as before. This could be generalized to allowing different limits depending on the insured's characteristics, e.g. a certain IT security level could be considered a prerequisite for a contract with a high limit. Similarly as above, Fig. 7 displays the (simulated) premium. While changes in the absolute numbers for the expected value principle are minor, the use of other common principles are now viable (all moments exist for the truncated losses) and deliver stable results. and AVaR 0.99 analogously to above are given in Online Appendix A.6. As to be expected, the introduction of a cover limit leads to an overall decrease in both risk measures, where the effect is higher for sub-portfolios with lower security (who tend to suffer the most severe losses and are therefore most impacted by a cover limit) and AVaR 0.99 decreases more than VaR 0.99 in absolute numbers (see Figure 9 in Online Appendix A.6).

Conclusion
We have presented an actuarial approach to modelling cyber risk that is consistent with the characteristics of the underlying risk factors from an economic and information-technological viewpoint. For this purpose, the existing literature on technical, statistical, economic, actuarial, and legal aspects of cyber risk was analysed in detail to identify relevant risk factors and plausible distributional assumptions within an actuarial framework. By construction, the resulting model is able to capture accumulation risk stemming from multiple firms being simultaneously affected by a cyber event; a prospect that insurers are especially worried about. Some distributional properties of the model and their relevance in the cyber context were highlighted. Moreover, we demonstrated how the model can be implemented in an insurance context using a loss distribution approach. An illustrative simulation study makes use of this implementation and derives the yearly premium for individual contracts as well as common portfolio risk measures. The model is stressed in different directions (contract design, the omission of systemic events) and the findings are analysed from the perspective of an actuary. Given the scarcity of available data on cyber losses, let us reiterate that distributional assumptions and concrete parameter choices rely on the existing literature (scattered across different disciplines) and expert judgments, hence, all quantitative findings should be interpreted with some caution in the light of model/parameter risk. Naturally, since the model presented here is not challenged on data, it is limited to its specific assumptions, e.g. using a Poisson process for arrivals; for the exemplary simulation study, these assumptions are further simplified to illustrate the actuarial exercise. However, to account for updates in the future, we consciously use a modular design that could allow to alter/ replace parts of the model or to adapt it to a specific portfolio an insurance company works with. Table 9 Conditional exceedance probabilities ℙ(L ij >M | L ij > u ij ) × 10 2 of three cover limits for large severity incidents We observe that a cover limit in most cases impacts only very few (large) claims Many interesting aspects, however, remain open for future research. Once sufficient cyber risk data is available, optimal estimation procedures and out-of-sample tests for the model assumptions are called for. Less theoretical, but equally important, appears the economic/legal question of categorizing cyber incidents. From the actuarial perspective, extremely interesting is the question of (optimal) cyber insurance contract design. Currently offered cyber insurance products seem to reflect the lack of an established common understanding of cyber risk and the resulting caution with which many insurers approach the topic. A better understanding of the underlying dynamics of cyber risk will in time hopefully enable product design to reflect economic optimality criteria instead of the insurers' operational limitations. Furthermore, what separates cyber from most other loss categories is the potential of designing cyber insurance products that transcend mere risk transfer, e.g. by including incident response teams or other services. To the best of our knowledge, this (non-traditional) part of cyber insurance contract design has not yet been addressed from an academic actuarial science viewpoint.

Fig. 7
We compare the premium (according to the three principles introduced in Sect. 4.4) that would be assigned to firms if they were grouped according to their IT security level and every claim in each contract had a limit of M 2 Funding Open Access funding enabled and organized by Projekt DEAL.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/.