At the time the Marriott data breach was announced, in November 2018, the Cyence Cyber Risk model was predicting that Marriott had an 83% probability of having any incident and a 43% probability of having a data breach specifically[1]. Perhaps more relevant, our model estimated a 12% probability of having an insurance-relevant[2] incident.
Given an event like a data breach of 500 million records at a company like Marriott, we would expect a median[3] loss of approximately $150 million. The range[4] of that loss runs from $100 million to $1.5 billion. The variation stems from a few factors. On the high end, while it is unlikely that GDPR would fine Marriott the maximum of 4% of global revenue (approximately $900 million), it is plausible. Starwood properties are also located across multiple geographies and frequented by international visitors, so it is possible that several class-action suits could arise. Another source of variation is the record types: while Marriott may have lost up to 500 million records, it is unlikely that all of them contained sensitive[5] PII or PCI.
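To make the severity reasoning above concrete, here is a minimal Monte Carlo sketch that combines a skewed baseline response cost with a small chance of a large regulatory fine (GDPR allows up to roughly 4% of global revenue, about $900 million here) and possible class-action costs. All probabilities and distribution parameters below are illustrative assumptions for exposition, not Cyence model values:

```python
import math
import random

random.seed(0)

def simulate_loss():
    """One simulated breach loss in dollars (all parameters hypothetical)."""
    # Baseline response costs (notification, forensics, remediation):
    # a right-skewed lognormal centered near the median figure above.
    loss = random.lognormvariate(math.log(130e6), 0.4)
    # Regulatory fine: assumed small chance of a large GDPR penalty,
    # capped near 4% of global revenue (~$900M for Marriott).
    if random.random() < 0.05:
        loss += random.uniform(100e6, 900e6)
    # Class actions across geographies: assumed 30% chance, mean $50M.
    if random.random() < 0.30:
        loss += random.expovariate(1 / 50e6)
    return loss

losses = sorted(simulate_loss() for _ in range(100_000))
median = losses[len(losses) // 2]
p5 = losses[int(0.05 * len(losses))]
p95 = losses[int(0.95 * len(losses))]
```

Because the fine and litigation terms fire only occasionally, the simulated distribution is right-skewed: the median stays near the baseline while the upper percentiles sit several times higher, which is one reason to report a median rather than a mean for losses like this.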
Our modeling takeaway from an event like this is to evaluate and recalibrate as appropriate for our next model release. We already differentiate between PHI, PCI, and PII, but can we factor in different sensitivity levels of PII by line of business? The Marriott loss also involved a very large number of records: 500 million people, more than 6% of the world's population. One way to reflect the possibility of an extreme[6] number of records lost is to recalibrate the tail. Many severity models, particularly in the natural catastrophe space, use heavy-tailed distributions to model event severity. However, a key difference for cyber is that we are modeling human behavior. Hackers who are interested in exfiltrating data will often try to take as many records as they can, and the main barrier to a hacker getting all of a company's records is system segmentation[7]. This means that a severity model should account for some lumpiness[8] in the number of records lost; in this specific case, exponential or heavy-tailed distributions are likely not appropriate.
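The all-or-none segmentation argument can be illustrated with a toy simulation: records sit in a handful of segments, and a hacker who compromises a segment exfiltrates everything in it. The segment sizes and per-segment compromise probability below are hypothetical, chosen only to show the shape of the outcome distribution:

```python
import random

random.seed(1)

# Hypothetical segmentation: 500M records split across four systems.
SEGMENTS = [200e6, 150e6, 100e6, 50e6]
P_COMPROMISE = 0.4  # assumed probability a hacker reaches any one segment

def records_lost():
    # All-or-none per segment: reaching a segment means taking every
    # record in it, so totals are sums of whole segments.
    return sum(size for size in SEGMENTS if random.random() < P_COMPROMISE)

outcomes = [records_lost() for _ in range(50_000)]
distinct_totals = sorted(set(outcomes))
```

With four segments there are at most 2^4 = 16 possible totals, including zero and the full 500 million; there is no smooth decay per marginal record lost. That lumpiness is exactly what a smooth exponential or heavy-tailed severity curve fails to capture.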
Cyence’s main focus is not whether we got this right[9], but how we can continue to improve our modeling estimates in order to better support our customers and the industry. The severity range is large despite known event details, which highlights that there is a lot of uncertainty and room to refine our approach. The outsized high end of the range also suggests that there may be an insurance gap (Marriott reportedly has a $300 million tower). Events like this demonstrate why companies should purchase cyber insurance, and, in cases like Marriott’s, why they probably should have even more coverage.
Over the last few years, we have seen an increasing number of towers with over $500 million in coverage, in part because of greater industry confidence in its ability to quantify that risk. Given the number of enterprises that choose not to purchase insurance or are underinsured due to liquidity constraints, we as an industry, Cyence included, still have a long way to go. At each step in our journey, it is important for us to be explicit about what we can do fairly well today (data breach frequency) and where we stand to improve (GDPR fines, data type classification). We need to continue to educate ourselves and our partners to better use our information and models, and to customize as needed based on their own judgement as they continue making insurance-relevant capital decisions.
[1] At least one incident that could have been covered under a typical cyber policy, if over retention, within a 12-month period
[2] $10 million threshold; given Marriott’s economic footprint and risk profile, we assume this would be the minimum retention for such a policy
[3] We report the median because it is a more appropriate measure of centrality given the skew
[4] 90% confidence interval
[5] Sensitive records in this case could include passports (e.g. in the US, where the cost is around $110)
[6] Extreme given that Marriott is a hospitality chain
[7] Records should generally be stored in different areas, so a hacker will need to find a way to access and extract each group of records separately
[8] In a perfect world, we would have detailed system segmentation data around record storage locations and would factor in the number of records lost as a function of how long a system was accessed. Even without this information, we recognize that this relationship should not be an exponential drop for each marginal record lost; it is fairly likely that, within a given segment, either all records or none are stolen.
[9] This tends not to work well statistically with a sample size of 1 anyway