Addressing insurance of data breach cyber risks in the catastrophe framework

The Geneva Papers on Risk and Insurance - Issues and Practice

Abstract

Considering data breaches as a man-made catastrophe helps clarify the actuarial need for multiple levels of analysis, going beyond claims-driven loss statistics alone, and calls for specific advances in both data and models. The prominent human element and the dynamic, networked and multi-type nature of cyber risk are perhaps what make it uniquely challenging. Complementary top-down statistical and bottom-up analytical approaches are discussed. Focusing on data breach severity, we exploit open data for events at organisations in the U.S. We show that this extremely heavy-tailed risk is worsening for external attacker ‘hack’ events. Writing in Q2 of 2018, we predicted a median of 0.5 billion ids breached in the U.S. due to hacking in the last 6 months of 2018, with a 5% chance that the figure would exceed 7 billion, doubling the historical total. ‘Fortunately’, the total breached in that period turned out to be near the median.

Notes

  1. Of basic personal information ‘ids’, such as name, social security number, address, etc., as well as accounts, transactions, and privileged communications. Article 4(12) of the GDPR defines a personal data breach as “a breach of security leading to the accidental or unlawful destruction, loss, alteration, unauthorised disclosure of, or access to, personal data transmitted, stored or otherwise processed.”

  2. For example, software versus network based events (Boehme and Schwartz 2010; Öǧüt et al. 2011; Mukhopadhyay et al. 2013), and data loss or unauthorised modification, data breach, and financial fraud (Cebula et al. 2010; Biener et al. 2015; Eling and Schnell 2016a).

  3. See Gordon and Sohail (2003), Bandyopadhyay et al. (2009), Mukhopadhyay et al. (2013), Eling and Wirfs (2015) as well as Eling and Schnell (2016a, b).

  4. https://www.weforum.org/agenda/2019/03/what-happens-in-an-internet-minute-in-2019/.

  5. https://www.microsoft.com/security/blog/2016/01/27/the-emerging-era-of-cyber-defense-and-cybercrime/.

  6. https://www.zdnet.com/article/cloud-computing-will-virtually-replace-traditional-data-centers-within-three-years/.

  7. Such as the concentration of semiconductor manufacturing in a region of high flood risk in Thailand, see Romero, J. “The Lessons of Thailand’s Flood”, IEEE Spectrum, 1 Nov 2012. Also, in November 2017, it was confirmed that millions of Intel chips have a major vulnerability, potentially allowing arbitrary remote code execution and privileged information access. See https://www.us-cert.gov/ncas/current-activity/2017/11/21/Intel-Firmware-Vulnerability.

  8. Based on 75% of worldwide browser traffic on the Internet, Maillart et al. (2011) document a massive “law of procrastination” in the form of a general class of power law behaviour of the waiting times to update outdated software. See also Saichev and Sornette (2010).

  9. "Collection #1". https://www.troyhunt.com/the-773-million-record-collection-1-data-reach/.

  10. See https://www.enisa.europa.eu/news/enisa-news/european-commission-proposal-on-a-regulation-on-the-future-of-enisa.

  11. https://www.csoonline.com/article/2130877/the-biggest-data-breaches-of-the-21st-century.html.

  12. Among other conditions, with event loss in excess of USD 20–50 million, depending on sector.

  13. There is no commonly agreed definition of resilience: in general, it is the ability of the system to sustain or restore basic functionality following a risk source or event (even unknown)—SRA (2015). A concrete definition of resilience for critical infrastructure has been given by Kröger (2019).

  14. “Expert Workshop on Improving the measurement of digital security incidents and risk management”. OECD. 12–13 May 2017. http://www.oecd.org/sti/ieconomy/improving-the-measurement-of-digital-security-incidents-and-risk-management.htm.

  15. See http://www.privacyrights.org/data-breach.

  16. Uber Paid Hackers to Delete Stolen Data on 57 Million People, By Eric Newcomer, 21 November 2017 https://www.bloomberg.com/news/articles/2017-11-21/uber-concealed-cyberattack-that-exposed-57-million-people-s-data.

  17. To test empirically, one could look at data breaches in each state separately and see if there was a change point when a new cyber law was introduced. This could be an interesting way forward but goes beyond the current paper.

  18. Highest frequencies are in New York and California, with highest severities in Nebraska, Nevada, and D.C. See Appendix A1.

  19. Stolen identities have been used for fake comments online, distorting the appearance of important dialogues, and “hacking consensus”. See, for example, the information on Hackernoon: https://hackernoon.com/more-than-a-million-pro-repeal-net-neutrality-comments-were-likely-faked-e9f0e3ed36a6.

  20. Useful categories could include: external/internal actor, data media (hard or software), attack strategy/mode, intentional or accidental, actual effect or potential (i.e., precursor), data type, aggregating factors (for example, distributed online), “cost” (for example, total fraud, total liability, etc.).

  21. Uber Breach, Kept Secret for a Year, Hit 57 Million Accounts. The New York Times.  November 22, 2017. See https://www.nytimes.com/2017/11/21/technology/uber-hack.html.

  22. A one-sided Z-test of the growth rates of the u = 10⁶ and u = 10⁴ fits gives p = 0.007, indicating significantly faster growth of more extreme breaches.

  23. The distribution such that the natural logarithm is the lower-truncated normal distribution with parameters (µ, σ²). See Malevergne et al. (2011) for examples, and the uniformly most powerful unbiased test against the Pareto tail.

  24. Pareto distribution with upper truncation set to the size of the largest HACK event: 3 × 10⁹.

  25. According to a chi-square likelihood ratio test, only the untruncated Pareto is significantly worse (p < 0.01).

  26. With intercept 13.1 (2.3), slope 1.7 (0.4), and NB dispersion parameter 14.3 (6.65), for example, predicting a level of 13.1 events per 6 months in 2005, and approx. 35.2 by Q3 2017, where the exponential model predicts 38.4.
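
The upper-truncated Pareto of note 24 and the one-sided Z-test of note 22 can be illustrated with a minimal sketch. This is not the authors' code: the function names, the tail exponent `ALPHA`, the lower threshold `LOWER`, and the growth-rate figures passed to the Z-test are all hypothetical values chosen for illustration. Only the upper truncation of 3 × 10⁹ is taken from the notes. The Z-test assumes the two growth-rate estimates are independent and approximately normal, which the notes do not state.

```python
import math
import random
from statistics import NormalDist


def truncated_pareto_cdf(x, alpha, lower, upper):
    """CDF of a Pareto law with tail exponent alpha, restricted to [lower, upper]."""
    if x <= lower:
        return 0.0
    if x >= upper:
        return 1.0
    norm = 1.0 - (lower / upper) ** alpha  # probability mass kept after truncation
    return (1.0 - (lower / x) ** alpha) / norm


def truncated_pareto_quantile(p, alpha, lower, upper):
    """Inverse of the CDF above for 0 <= p <= 1 (usable for inverse-transform sampling)."""
    norm = 1.0 - (lower / upper) ** alpha
    return lower * (1.0 - p * norm) ** (-1.0 / alpha)


def one_sided_z_test(rate_a, se_a, rate_b, se_b):
    """p-value for H1: rate_a > rate_b, assuming independent normal estimates."""
    z = (rate_a - rate_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    return 1.0 - NormalDist().cdf(z)


if __name__ == "__main__":
    ALPHA = 0.5   # hypothetical tail exponent, not a fitted value from the paper
    LOWER = 1e4   # hypothetical lower threshold in ids
    UPPER = 3e9   # upper truncation from note 24

    rng = random.Random(0)
    draws = [truncated_pareto_quantile(rng.random(), ALPHA, LOWER, UPPER) for _ in range(5)]
    print("sampled breach sizes:", [f"{d:.3g}" for d in draws])

    # Hypothetical growth rates and standard errors for two threshold fits.
    print("one-sided p-value:", one_sided_z_test(0.30, 0.05, 0.10, 0.06))
```

Truncating at the largest observed event keeps every simulated breach at or below 3 × 10⁹ ids, whereas an untruncated Pareto with a small exponent would occasionally produce sizes far beyond anything observed.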

Author information

Corresponding author

Correspondence to Spencer Wheatley.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

A1: Cyber events in the U.S. by state

Figure 7 shows the 12-year frequency and median severity of information items compromised by U.S. state. As can be seen, the highest frequencies are in the states of New York and California, followed by Texas and Ohio. The highest severity, interestingly, is for Nebraska, followed by Nevada and District of Columbia.

Fig. 7

12-year frequency and median severity of ids (information items compromised) by U.S. state. Notes: We use only breach events from the PRC [rather than OSF DLDB] database, where state information is readily accessible. States with ≤ 10 cyber incidents over the 12 years were omitted [i.e., Alaska, Arkansas, Delaware, Hawaii, Idaho, Kansas, Louisiana, Maine, Mississippi, New Hampshire, New Mexico, North Dakota, Rhode Island, South Dakota, Vermont, West Virginia, Wyoming]. The inset panel magnifies the cluster near the origin.
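
The per-state aggregation behind Figure 7 can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `state_summary`, the `(state, ids)` record layout, and the toy records are assumptions. Only the rule of omitting states with ≤ 10 incidents comes from the figure notes.

```python
from collections import defaultdict
from statistics import median


def state_summary(records, min_events=10):
    """Map state -> (event count, median ids), omitting states with <= min_events events."""
    sizes_by_state = defaultdict(list)
    for state, ids in records:
        sizes_by_state[state].append(ids)
    return {
        state: (len(sizes), median(sizes))
        for state, sizes in sizes_by_state.items()
        if len(sizes) > min_events
    }


if __name__ == "__main__":
    # Toy records (hypothetical): 12 events in one state, 3 in another.
    toy = [("NY", 1000 * k) for k in range(1, 13)] + [("AK", 500)] * 3
    for state, (count, med) in state_summary(toy).items():
        print(state, count, med)
```

Using the median rather than the mean keeps a single very large breach from dominating a state's severity figure, which matters for heavy-tailed sizes.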

A2: Chronology of cyber events: is there a general trend?

There is a strong trend towards increasing frequency and severity of hack data breach events with a size > 10,000 k. This very strong trend is not distinguishable for overall breach events in excess of 10 k (see Figs. 8, 9).

Fig. 8

Chronology of high-impact cyber events involving ids > 25,000 k compromised during 01/2005 through 9/2017. We can see a strong upward trend in severity here. Note: Natural logarithmic scale for ids along the ordinate axis.

Fig. 9

Chronology of all breach events from 01/2005 through 9/2017. As can be seen, the overall picture for the severity shows a weak upward trend. A more specific view (as shown in Fig. 8) is necessary to identify trends. Note: Natural logarithmic scale for ids.
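
One simple way to compare the severity trend above a size threshold with the trend over all events is an ordinary least-squares slope of ln(ids) against time. The sketch below is an assumption-laden illustration, not the method used in the paper: the function name `log_severity_trend`, the `(time, ids)` event layout, and the synthetic data are hypothetical.

```python
import math


def log_severity_trend(events, threshold):
    """Least-squares slope of ln(ids) against time for events with ids > threshold."""
    points = [(t, math.log(ids)) for t, ids in events if ids > threshold]
    if len(points) < 2:
        raise ValueError("need at least two events above the threshold")
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    if sxx == 0:
        raise ValueError("all selected events share the same time")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    return sxy / sxx


if __name__ == "__main__":
    # Synthetic events (hypothetical): large breaches grow over time, small ones stay flat.
    large = [(year, math.exp(16.0 + 0.4 * year)) for year in range(13)]
    small = [(year, 2e4) for year in range(13)]
    events = large + small
    print("slope above threshold:", log_severity_trend(events, 1e6))
    print("slope over all events:", log_severity_trend(events, 0))
```

In this synthetic example the slope over all events is diluted by the flat small breaches, which mirrors the caption's point that a threshold-specific view is needed to see the trend.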

About this article

Cite this article

Wheatley, S., Hofmann, A. & Sornette, D. Addressing insurance of data breach cyber risks in the catastrophe framework. Geneva Pap Risk Insur Issues Pract 46, 53–78 (2021). https://doi.org/10.1057/s41288-020-00163-w

  • DOI: https://doi.org/10.1057/s41288-020-00163-w
