GDPR quick tip: Know what data (models) you have

Amid all the kerfuffle around the General Data Protection Regulation, GDPR (which applies to any organization handling European citizen data, wherever they are located), it can be hard to know where to start. I don’t claim to be a GDPR expert – I’ll leave that to the lawyers and indeed, the government organizations responsible. However, I can report from my conversations around getting ready for the May 25th deadline.
In terms of policies and approach, GDPR is not that different to existing data management best practice. One potential difference, from a UK perspective, is that it may mean the end of unsolicited calls, letters and emails: for example, the CEO of a direct mail organization told me it may be the demise of ‘cold lists’, that is, collections of addresses to be targeted without any prior engagement (which drives many ‘legitimate interest’ justifications), contract or consent.
But this isn’t a massive leap from, say, MailChimp’s confirmation checks, themselves based on spam blacklisting and the right to complain. And indeed, in this age of public, sometimes viral discontent, no organization wants to have its reputation hauled over the coals of social media. When they do, it appears, they can get away with it only for so long before they cave in to public pressure to do a better job (recent examples include Uber and a few budget airlines).
All this reinforces the point that organizations doing right by their customers, and therefore their data, are likely already on the right path to GDPR compliance. The Jeff Goldblum-sized fly in the ointment, however, is the conclusion reached in survey after survey about enterprise data management: most corporations today don’t actually know what information they have, including about the people with whom they interact.
This is completely understandable. As technology has thrown innovation after innovation at the enterprise, many have adopted a layer-the-new-on-top-of-the-old approach: to do otherwise would have left them at the wayside long ago. Each massive organisation is an attic of data archival, a den of data duplication, a cavern of complexity. To date, the solution has been a combination of coping strategies, even as we add new layers on top.
But now, faced with massive potential fines (up to 4% of annual global turnover or €20 million, whichever is higher), our corporations and institutions can no longer de-prioritise how they manage their data pools. At the same time, there is no magic wand to be waved, no way of really knowing whether the data stored within is appropriate to the organization’s purposes (which indeed, may be very different to when they were established).
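To put a number on that exposure, here is a minimal sketch of the upper bound, assuming the higher fine tier is the greater of €20 million or 4% of annual worldwide turnover. The function name and the sample turnover figures are illustrative only, and an actual penalty is set by the regulator and may be far lower.

```python
FLAT_CAP_EUR = 20_000_000   # fixed ceiling for the higher fine tier
TURNOVER_PERCENT = 4        # percentage of annual worldwide turnover


def max_gdpr_fine(annual_turnover_eur: float) -> float:
    """Return the ceiling on a fine: the greater of the flat cap or 4% of turnover."""
    turnover_based = annual_turnover_eur * TURNOVER_PERCENT / 100
    return max(FLAT_CAP_EUR, turnover_based)


# Hypothetical organizations: one small enough that the flat cap dominates,
# one large enough that the percentage dominates.
small_org_ceiling = max_gdpr_fine(100_000_000)
large_org_ceiling = max_gdpr_fine(1_000_000_000)
```

The crossover sits at €500 million of turnover: below that the flat €20 million figure is the ceiling, above it the 4% figure takes over.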
Meanwhile, looking at the level of systems is not going to be particularly revealing, so is there an answer? A starting point is to look somewhere in-between data and systems, focusing on meta-data. Data models, software designs and so on can be revelatory in terms of what data is held and how it is being used, and can enable prioritization of what might be higher-risk (of non-compliance) systems and data stores.
Knowing this information enables a number of decisions, not only about the data but also what to do with it. For example, a system holding information about the children of customers may still be running, without anyone’s real knowledge. Just knowing it is there, and that it hasn’t been accessed for several years, should be reason enough to switch it off and dispose of its contents. And indeed, even if 75% of marketing data will be ‘rendered obsolete’, surely that’s not the good part anyway?
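As a rough illustration of how meta-data alone can drive these decisions, the sketch below triages a small inventory of data stores. Everything in it is hypothetical: the store names, the category labels, the three-year staleness threshold and the `triage` function itself are assumptions made for the example, not a prescribed method.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical labels for higher-risk personal data; a real inventory
# would use the organization's own taxonomy.
SENSITIVE = {"children", "health", "financial"}


@dataclass
class DataStore:
    name: str
    categories: set[str]   # kinds of personal data the data model says it holds
    last_accessed: date    # taken from access logs or system meta-data


def triage(stores, today, stale_after_days=3 * 365):
    """Split stores into retirement candidates and ones to review first."""
    retire, review = [], []
    for store in stores:
        stale = (today - store.last_accessed).days > stale_after_days
        sensitive = bool(store.categories & SENSITIVE)
        if stale:
            retire.append(store.name)    # unused for years: candidate to switch off
        elif sensitive:
            review.append(store.name)    # in use and higher risk: look here first
    return retire, review


stores = [
    DataStore("loyalty_kids_club", {"children", "contact"}, date(2013, 1, 10)),
    DataStore("crm", {"contact", "financial"}, date(2018, 4, 30)),
    DataStore("web_logs", {"ip_address"}, date(2018, 5, 1)),
]
retire, review = triage(stores, today=date(2018, 5, 25))
```

With this sample inventory, the long-untouched children’s loyalty system lands in the retirement list and the CRM in the review list, while the web logs fall into neither.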
Even if you have a thousand such systems, knowing what they are and what types of data they contain puts you in a much better position than not knowing. It’s not a surprise that software vendors (such as Erwin, founded as a data modelling company in the 1990s, absorbed into CA, then divested, its portfolio since broadened), who have struggled to demonstrate their relevance in the face of “coping strategy” approaches to enterprise data governance, are now setting out their stalls around GDPR.
Again, no magic wands exist but the bottom line is that it is becoming a legally enforceable requirement for organizations to be able to explain what they are holding and why. As a final thought, this has to be seen as good for business: focus on what matters, the ability to prioritize, to better engage, to deliver more personalized customer services, all of these are seen as high-value benefits above and beyond a need to comply with some legislative big stick.

Is re-regulation, not deregulation, the answer to the financial world’s continuing woes?

The amount of financial regulation in the world continues to increase, creating an ever-growing burden on banks and other financial institutions. Banks only have themselves to blame, goes the pervading view, creating exploitative situations such as sub-prime mortgages and credit swaps, thus collapsing the bond of trust they maintained with their customers.

But the reality is that we all suffer under the cosh of increased bureaucracy and cost, with no real benefit other than (we hope) reducing the risk of being exploited ourselves, or indeed, of the 2008 financial crisis happening again. In part the collapse of global finance was caused by direct exploitation, but a bigger crime was how financial organisations demonstrated their institutional incompetence.

They had one job — to support and protect the dollars and cents of their customers — but organisations from Lehman Brothers to RBS showed not only their ineffectiveness against corrupted behaviours, but also their poor grasp of shared risk. Above all, and even with the caveats of how complex the situation became, our smart-suited financiers proved themselves to be really crap at maths.

The result was the undermining of global confidence. “2008 saw the collapse of trust and legitimacy,” said Anne Leslie-Bini of consulting firm BearingPoint, at a recent analyst event in Paris. “Governments, central banks and trusted financial intermediaries found themselves brutally exposed, meaning the public lost faith in the very institutions that are meant to represent, protect and further their interests.”

We are still dealing with the consequences, one being more power in the hands of the regulators, whose systems were also proved to be ineffective but who no doubt feel they are doing the right thing by creating more. The result is a continued flood of regulations: nearly ten times as many publications are released per year compared to pre-1994 levels. “There is no end in sight,” continued Anne.

Of course regulators will have the best intentions, but the effect is to stymie financial institutions without necessarily dealing with the potential for rogue trading, product mis-selling or other, yet to emerge financial malpractice or imploding bubble. It reinforces the notion that working within a regulation is by definition ethical — the “I did nothing wrong” school of thought.

At the same time however, we are not seeing any regrowth of trust. Precisely the opposite could be said to be true, in this environment of fake news and political spin. We live in a context of post-truth where nobody knows who to trust, which can be exploited by both the untrustworthy and those looking to gain from promoting distrust. Such a situation ultimately serves nobody.

We are neither willing nor likely to go back to that rose-tinted world where the default behaviour was blind trust in our institutions and elders, as computers and the economics of big business have put paid to that. All the same we need a rethink in how we develop and deliver regulation, one which aligns with how the world is today rather than trying to follow a historical, institution-based model.

According to Anne Leslie-Bini, this is the opportunity presented by regulatory technology (RegTech) — we can fight like with like, creating regulations according to the same principles as the technologies used by institutions. So for example, rather than expecting banks to produce monthly reports, access to banking data should be available in real-time, via APIs.

Such ideas can be taken much further, however. If we are in the platform economy, for example, regulation can, and should, be built into the platform. Just as “Data should come out of the pipe clean,” as Cisco’s Charlie Giancarlo once pointed out, so should it be expected that the virtual money coursing around our networks is correctly sourced and with traceable provenance (using Blockchain, for example).

Thinking beyond technology, a third pillar is to consider how business practice is changing and to expect regulation to follow suit. So, if agility, scalability, co-creation and customer experience are key levers for driving business value, so should we expect agile, scalable regulation and so on. Co-creation of regulations is a model already tested and being proven by regulators and financial organisations in Austria.

Such thinking is a long way from the expectation of sheep-like compliance with laws conjured up by some inaccessible people in a distant corner of the globe. “As long as we are operating in a system where we have to constrain behaviours that act contrary to the common good, regulation will always be playing catch-up,” says Anne.

The future is not about de-regulation but re-regulation, delivering models that enable our very necessary institutions to fit how the world works today. Regulation should not be based on building ever-higher walls around our financial institutions but aimed towards striking a balance, to deliver the right levels of protection for citizens and businesses within a framework of ethics that increases, rather than undermines, trust.

[Disclaimer: BearingPoint is a client]

DAM – That’s Secure?

The holy grail of database protection may come in the form of DAM married to an agentless platform that employs Artificial Intelligence (AI)

After handing over license plate info, Uber re-opens its NYC bases

Uber has relented against New York’s Taxi and Limousine Commission (TLC) and handed over its data. As a result, it can now open the five Uber dispatch bases that the TLC shut down when Uber refused to comply.

New York City passengers won’t see much of a difference in service. The bases are locations for administrative work, so their closure inconvenienced drivers who needed to go for licensing tests and training, not riders. Furthermore, the TLC had decided to block Uber’s request to open a base in Brooklyn until it handed over trip information.

The ride-hailing company publicly announced its decision to offer more data to cities last month, starting with Boston. At the time, it said it would be anonymizing all the data and offering it in aggregate to cities so they can make policy decisions like planning public transit routes.

But the TLC required Uber to include vehicle license plate numbers if it wanted its bases reinstated. That means Uber will be giving that specific, non-anonymized driver information at least to the New York City government. The company told the New York Business Journal it’s doing so “under protest.”

Uber initially cited trade secrets for not wanting to give up its data. It may not have wanted local governments to be able to send their taxis to popular Uber areas at peak times, for example. But as evidenced by the blog post on giving data to Boston and other cities, the company had a change of heart, realizing that it could offer data to cities as an olive branch and potentially ease Uber’s local regulatory conflicts as a result.

Microsoft buys Equivio to embed machine learning into Office 365

Microsoft announced the acquisition of Equivio, a provider of machine learning technology used to analyze documents for compliance and discovery. The company’s products have been used primarily in legal and regulated industries, and especially where large sets of documents need to be organized. As Rajesh Jha, Corporate Vice President, Outlook and Office 365, wrote,

Equivio’s solution applies machine learning to help solve these problems, enabling users to explore large, unstructured sets of data and quickly find what is relevant. It uses advanced text analytics to perform multi-dimensional analyses of data collections, intelligently sorting documents into themes, grouping near-duplicates, isolating unique data, and helping users quickly identify the documents they need. As part of this process, users train the system to identify documents relevant to a particular subject, such as a legal case or investigation. This iterative process is more accurate and cost effective than keyword searches and manual review of vast quantities of documents.

Equivio has penetrated the legal community, U.S. federal agencies, corporations and other organizations.

Microsoft plans to extend the use of the technology in Office 365, so that it is more broadly applicable.

Microsoft has not disclosed the purchase price, but it is conjectured to be in the $150 million to $200 million range.

I’ve often wished for a smart tool to automatically organize my documents (and possibly generate better file names) based on analysis of their contents. This has become even more of an issue as I have documents spread across Google Docs, Dropbox, and my hard drive. I’ve been expecting Google to build more into Google Drive in that regard, but so far: not much.


Salesforce finally solidifies European data center plans

The UK-sited data center, which should help settle the compliance worries of many of Salesforce’s European customers, will be completed in 2014. The firm is also running a €5 million Innovation Challenge for EU startups.

With new service, Nasdaq brings Wall Street data to Amazon’s cloud

Nasdaq OMX is offering a new service called FinQloud for financial services clients that want to store regulatory data or analyze trade data using on-demand resources. Built atop Amazon Web Services, the service seems to be the result of a close partnership between the two companies.

2012: Cloud computing hits adolescence for better or worse

Here’s hoping 2012 will be the year we separate the wheat from the chaff in cloud computing. At the very least, more businesses will know about the potential benefits and pitfalls of cloud computing so they can differentiate the real from the bogus.