An insurer warning: Don't drown in the data lake

As more data is collected for machine learning models, more care must be taken with how data is stored, processed and accessed.

Insurance companies have a lot to be concerned about when it comes to keeping customer data safe.

Insurers collect a lot of important personal and financial information. Recent cyber attacks against insurance providers have made some of us yearn for the days of black ink, paper and filing cabinets. At least then, someone would actually have to physically break into your building to steal important documents.

Now, all they have to do is send you an e-mail.

Insurance carriers know many of the identity theft and insurance fraud risks out there. But there are also newer risks that insurers need to be aware of. The NAIC data security model law includes good recommendations, but it fails to address newer types of attacks and newer types of software.

Related: How to protect customer data while complying with GDPR

Some things have changed

More and more companies are investing in big data and other business strategies that use tools such as machine learning to generate unique insights. For instance, car accident claims prediction is a hot area, and premium pricing models are being used by the likes of Axa.

The push to innovate is forcing insurance organizations to adopt software environments different from those they were used to. Some approaches are achieving extraordinary results, but it all relies on lots and lots of data.

As more data is collected and used for machine learning models, more care needs to be taken with how that data is stored, processed and accessed.

Many so-called data lakes are haphazardly drawn up because they are usually composed from a multitude of sources. It can be difficult to organize so many disparate sources of data across many different teams.

Compounding this, it seems that every other month we come across a new data breach where someone has mistakenly put a gigantic wad of data in a publicly accessible place.
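
One practical safeguard is to audit your storage for public exposure on a schedule instead of assuming nobody will misconfigure it. The sketch below is illustrative only; it assumes an AWS environment and the boto3 SDK, and an equivalent check exists for any cloud provider's storage API.

    # A minimal sketch (assumes AWS and the boto3 SDK): flag S3 buckets whose
    # ACLs grant read access to "everyone" so they can be reviewed before
    # sensitive data lands in them.
    import boto3
    from botocore.exceptions import ClientError

    PUBLIC_GRANTEE = "http://acs.amazonaws.com/groups/global/AllUsers"

    def publicly_readable_buckets():
        s3 = boto3.client("s3")
        flagged = []
        for bucket in s3.list_buckets()["Buckets"]:
            name = bucket["Name"]
            try:
                acl = s3.get_bucket_acl(Bucket=name)
            except ClientError:
                continue  # no permission to inspect this bucket; skip it
            for grant in acl["Grants"]:
                if grant["Grantee"].get("URI") == PUBLIC_GRANTEE:
                    flagged.append(name)
                    break
        return flagged

    if __name__ == "__main__":
        for name in publicly_readable_buckets():
            print(f"Publicly readable bucket: {name}")

Running something like this regularly, and treating any hit as an incident, is far cheaper than reading about your data lake in a breach notification.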

Related: 6 ways cybersecurity will impact insurers in 2018

Avoid compromising your data

Data scientists and developers will invariably utilize cloud deployments (whether public, private or hybrid), and cybersecurity strategies have to account for that. Private cloud models are arguably much easier to protect than public cloud models, but that doesn't stop the march of public cloud adoption.

Insurance, being a highly regulated industry, has not adopted a lot of cloud resources, and those organizations that have done so face strict systems governance. While regulation routinely cripples insurance technology innovation, don't throw the baby out with the bathwater.

For InsurTech companies, having a well-defined secure software development life-cycle is paramount. There are a variety of methods and practices for introducing one. Some involve routine security audits, but those can be prohibitively expensive. Others involve run-time scanning and secure continuous integration practices; these are also good practices, though they come with some friction for your developers.
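
As a concrete illustration of the continuous-integration approach, here is a minimal sketch assuming a Python codebase and two freely available scanners, bandit and pip-audit; the tool choice and the "src" directory are assumptions, not a prescribed stack.

    # Minimal CI gate sketch: run a static analyzer and a dependency audit,
    # and fail the pipeline if either reports problems. Tool choice (bandit,
    # pip-audit) and the "src" path are assumptions; substitute your own.
    import subprocess
    import sys

    CHECKS = [
        ["bandit", "-r", "src", "-q"],  # static analysis of the source tree
        ["pip-audit"],                  # known-vulnerability check of dependencies
    ]

    def main() -> int:
        failed = False
        for cmd in CHECKS:
            result = subprocess.run(cmd)
            if result.returncode != 0:
                print(f"Security check failed: {' '.join(cmd)}", file=sys.stderr)
                failed = True
        return 1 if failed else 0

    if __name__ == "__main__":
        sys.exit(main())

Wiring a gate like this into every build is where the developer friction mentioned above comes from, but it catches problems before they ship.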

Another, newer method of provisioning software is becoming more common as well. With a technology known as unikernels, each virtual machine is designed to run only a single program, which is essentially what developers already do with ordinary VMs and containers; unikernels simply enforce it. Because no other program can execute on the machine, the remote code execution attacks that make up the initial intrusion in most data breaches are prevented from working.

If you use a third-party provider, ask them how they are provisioning their software. If they are on a public cloud, which is extremely likely, drill into their architecture and ask them how they secure their own applications. Ask them whether they have measures in place that disallow running other programs on the same instance as their code. If that's not the case, think twice.

It is important to remember that many of the security models drawn up in the late 1990s are not congruent with the security models that must be imposed in the 2020s. We don't deal with individual servers much anymore, and the concept of multiple users on the same server is largely outdated as well.

Now we deal with massive numbers of servers, and the operating model has been abstracted out much further, with things like access controls placed in the cloud software of choice. We need a new set of security standards, one that recognizes that the ability to run other third-party programs on the same instance that is running your code should be a red flag.
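
To make that red flag concrete, here is a minimal detection-oriented sketch, assuming a Linux instance and the psutil library, with a purely hypothetical allow-list. A unikernel makes this kind of after-the-fact check unnecessary, because nothing outside the one program can run in the first place.

    # Illustrative sketch (assumes a Linux host and the psutil library): alert
    # when anything outside an expected allow-list is running on an instance
    # that should only be running your application.
    import psutil

    # Hypothetical allow-list for a single-purpose application server.
    EXPECTED = {"python3", "gunicorn", "systemd", "sshd"}

    def unexpected_processes():
        found = []
        for proc in psutil.process_iter(["name"]):
            name = proc.info.get("name") or ""
            if name and name not in EXPECTED:
                found.append(name)
        return sorted(set(found))

    if __name__ == "__main__":
        for name in unexpected_processes():
            print(f"Unexpected program running: {name}")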

Keep innovating, but don’t drown in the data lake.

Ian Eyberg is CEO of the San Francisco-based software company NanoVMs. He can be reached by sending e-mail to info@nanovms.com.

The opinions expressed here are the author’s own.

See also:

3 technologies supporting the InsurTech revolution

3 best practices for a layered cybersecurity program