It seems to happen almost weekly nowadays: another massive brand hit by a data breach, with no real understanding of how or when it happened, or how many people are affected. The media - and the information management community - have been waiting for a GDPR test case since the legislation was first dreamt up. We sat down with Simon Parkinson, COO at Dot Group, to talk about what goes into a data audit. We talked data maps, asset registers, subject access requests and other tricks that ought to go some way towards keeping you and your organisation out of the headlines!
Simon’s been with Dot Group for six years and stays busy exploring smart analytics, innovative storage architectures and exciting cognitive deployments for clients looking at integrating next-gen technologies with strong commercial objectives.
Hey Simon! Thanks for taking the time to talk today. So what happens during a data audit? How long does it take?
No problem! Well, it’s a tough question to answer, because it really varies from client to client. Sometimes clients might have a good handle on their structured data, but it’s the unstructured data they’re grappling with. Other times, the lead information management person might have left, and the rest of the team are trying to pick up the pieces with a mess of spreadsheets, bits of paper and databases scattered around everywhere. For a small business, we might be in and out within three weeks, but larger projects might be spread over a few months.
OK, so what’s the first thing you do?
We need to figure out anywhere that data might be stored by a business. We’ve got some really clever software that accelerates the whole discovery process dramatically, and will essentially give us a map of a client’s network infrastructure, complete with all their data repositories highlighted for us. Automated data discovery makes a massive difference, because it gives us more time to focus on the triage and remediation. After the discovery, we can put together a data map - or an information asset register - and figure out what data resides where. They end up being really helpful when you have to deal with things like Subject Access Requests (SARs), as well as just being good practice in information and data management generally.
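To make that concrete: at its simplest, an information asset register is just a list of where data lives and what it contains. Here is a minimal sketch in Python - the repository names and fields are invented for illustration, not output from any real discovery tool:

```python
from dataclasses import dataclass

@dataclass
class DataAsset:
    name: str                     # repository or system name
    location: str                 # where it lives on the network
    data_types: list              # categories of data held
    contains_personal_data: bool  # relevant for GDPR and SARs

# Hypothetical entries of the kind a discovery scan might surface
register = [
    DataAsset("CRM database", "db01.internal",
              ["names", "addresses", "emails"], True),
    DataAsset("Job adverts folder", "fileserver/hr/adverts",
              ["job descriptions"], False),
    DataAsset("Finance share", "fileserver/finance",
              ["invoices", "payroll"], True),
]

# A Subject Access Request starts with: which systems hold personal data?
sar_scope = [asset.name for asset in register if asset.contains_personal_data]
print(sar_scope)  # ['CRM database', 'Finance share']
```

Even a toy register like this shows why the data map pays off: answering a SAR becomes a lookup rather than a scramble.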
Triage sounds a bit dramatic?
Yes, I guess it does. But once we know where the data is, we need to make sure it’s all classified properly. We need to categorise it based on sensitivity, and associate some risk with each category. That’s where experience helps. I mean, I’ve seen 16-digit credit card numbers being sent around a contact centre by email or instant messaging because someone needs a refund. The agents are under massive pressure to resolve the query as fast as possible, and just resort to whatever gets the job done. That kind of thing would get flagged at this stage.
The credit card numbers would obviously be pretty high risk, and then low-risk could be something like job adverts - they’re designed to be published online anyway. It’ll all go into a report and you’ll get a risk score as an end result.
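As a rough illustration of what a triage pass might check for, here is a hedged sketch that flags likely card numbers using a Luhn checksum (the standard validity test for card numbers) - the messages are invented, and a real tool would cover far more data types than this:

```python
import re

def luhn_valid(number: str) -> bool:
    """Standard Luhn checksum used to validate card numbers."""
    digits = [int(d) for d in number]
    # Double every second digit from the right, subtracting 9 if over 9
    for i in range(len(digits) - 2, -1, -2):
        digits[i] *= 2
        if digits[i] > 9:
            digits[i] -= 9
    return sum(digits) % 10 == 0

def classify(text: str) -> str:
    """Very rough sensitivity triage: flag likely card numbers as high risk."""
    for match in re.findall(r"\b\d{16}\b", text):
        if luhn_valid(match):
            return "high"
    return "low"

# 4111111111111111 is a well-known test card number, not real data
print(classify("Refund to card 4111111111111111 please"))  # high
print(classify("New job advert: Senior Data Engineer"))    # low
```

The Luhn check matters because a bare 16-digit pattern would also flag order references and phone numbers; the checksum filters out most of those false positives.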
What kind of things does the report say?
The report compiles all of the detailed intelligence and distils it into a risk score, giving the client a good overview of their risk from an information management perspective, and then outlines steps for remediation. For instance, it might be that a client has a database holding customer transaction history, with names and addresses and everything, but that database isn’t stored and run on a secure server. That’s quite an easy fix.
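One simple way to picture how findings get distilled into a headline number - the findings, severities, and scoring rule below are all invented for the sake of illustration, not Dot Group’s actual methodology:

```python
# Hypothetical audit findings, each rated for severity on a 1-10 scale
findings = [
    {"issue": "card numbers sent over email", "severity": 9},
    {"issue": "customer database on unsecured server", "severity": 7},
    {"issue": "job adverts on a public share", "severity": 1},
]

# A headline score could simply be the worst open finding,
# with the count giving a sense of how much remediation remains
overall = max(f["severity"] for f in findings)
print(f"Overall risk: {overall}/10 across {len(findings)} findings")
```

The point is less the arithmetic than the shape of the output: one number the board can track, backed by an itemised list the technical team can work through.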
Where things can get a bit more complicated is where you have lots of third parties involved, because under GDPR, if your data is shared or processed with or by anyone else, then the responsibility is still yours. Our discovery tools will automatically highlight the dependencies and consuming endpoints of any data in your network. It might be that you have an FTP server to dump records from a CRM to an email marketing agency, or your finance team use an external payroll provider with all its data in the cloud. These would be flagged as potential risks, so you’re aware of them.
And who does the fixing?
Well, again that really depends. Sometimes the client will take the report away and digest it, carry out some of the simpler recommendations themselves, and bring us in again to help with the more technical ones. Other times we carry out all the remediation fixes as part of an outsourced service, further augmented with a ‘Data Protection Officer as-a-service’ offering. Again, it often depends on the size and complexity of a client’s infrastructure, but we often end up working quite closely with network architects and database admins to tighten up data protection and privacy. We can also get quite far using some pretty neat software - non-compliant databases can be layered with obfuscation and masking tools, for instance, which might mean you don’t need to radically change the network architecture in that case.
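To show the idea behind masking in miniature - this is a toy sketch, not how any particular masking product works, and the row is invented - a masking layer rewrites sensitive values on the way out while leaving the underlying schema untouched:

```python
import re

def mask_pan(value: str) -> str:
    """Mask a 16-digit card number down to its last four digits."""
    return re.sub(r"\b\d{12}(\d{4})\b", r"************\1", value)

row = "Customer paid with 4111111111111111 on 2023-05-01"
print(mask_pan(row))  # Customer paid with ************1111 on 2023-05-01
```

Keeping the last four digits preserves enough for support staff to confirm a card with a customer, without the full number ever leaving the secure store.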