Ten ways to shine a light on dark data in pharmaceuticals and find value in it

How businesses can benefit from finally taking control of the great unknown

John Culkin, Crown Records Management

Just hearing the words "dark data" can be enough to make managers in the pharmaceutical sector feel a little uneasy but, despite its rather menacing name, there has never been a better time to tackle the phenomenon head on.

When you think of the recent explosion in data it’s hardly surprising that businesses worry about how dark data — defined as "information assets that organisations collect, process and store during regular business activities, but generally fail to use for other purposes" — will affect them in future.

So why are so few doing anything about it?

Rather than shying away from the challenges, we need to ask two questions: how do you get dark data under control, and how can you derive benefit from it?

The threats posed by an explosion in data are obvious: businesses that don’t know what information they have or how to find it are opening themselves up to the possibility of a data breach, and are in danger of falling foul of data protection regulation as it continues to evolve.

In pharma, the problem is particularly acute.

Regulation demands strict quality in manufacturing and research for safety reasons. However, raw clinical data and clinical trial data are often held in incompatible formats, which makes it difficult to draw useful insights from them.

Like all large organisations dependent on many complex systems, pharmaceutical companies will strive to meet their compliance needs; however, they are still likely to have data hidden in their systems that is either not necessary to keep or could contain potentially useful information.

By finding and exposing the data, and all the places it hides, organisations can actually realise gains. But the bottom line is that organisations need to know what dark data is and where it is in the first place. If you don’t understand it, how can you possibly be in control of it, let alone realise the value hidden in it?

The problem is that hiding dark data away seems to be not just planned but instinctive, as if it were down to human nature.

People fill their attics and garages with "useful" things that could be needed in the future. The corporate equivalent is when people say: "Keep the data, just in case."

How much money around the world is wasted due to those few words?

Perhaps data is being kept "in case it is needed in the future," assuming it can be found at the time of need, of course. Or because a company believes some unknown future insight can be found using future analytical technology.

Perhaps it is stored away out of a mistaken belief that keeping more data means a company is less likely to lose something. Or because a business believes data is like a rare book and its monetary and nostalgic value might increase.

All of these misconceptions lead to a hoarding of dark data, which is unnecessary, not useful and inherently risky.

Having a proper information management system in place to prevent the storage of too much dark data is crucial. Here, then, are some pointers for those in the pharmaceutical industry on how to store less dark data and how to find value in what is kept.

Top tips to prevent dark data accumulating – and then to derive benefit from it

Perhaps the first place to start for those in pharma is to realise that data may go to waste or sit unused if it is not properly managed – or if systems are not properly designed in the first place. This can include data which may contain potential insight for the future.

All industries are likely to see an explosion in data as the Internet of Things takes hold – and for pharmaceuticals it seems inevitable. The vast amount of data that genomics and personalised medicine will generate in future will massively increase the volume of data being stored, creating further challenges.

Reducing the amount of dark data kept is crucial, so don’t accept the argument that everything can be kept just because data storage is cheap.

Develop policies and continuous training for all staff about managing data (including data on local drives, laptops, shared drives, removable devices and mobile devices).

Understand which regulations require which data to be kept, and for how long. Don’t keep it longer than necessary unless a clear benefit can be derived from it. A quarter of respondents in a Crown Records Management survey this year said they were "not confident" they knew how long they were legally obliged to keep data. This is a major concern.

Don’t assume that any data in sufficient volume is “Big Data” and therefore has value. Much of it will not be useful in big data projects, especially as it’s often unstructured and in various formats.

Use File Level Analysis software to analyse the content of data rather than just its creation and expiry dates. It can help a business understand information content, type, size and location.

Map data sources generated by internal systems or received from external sources; this is about documenting "where" and "why" the data lives and how it was derived.

Have an Information Governance programme in place, supported by the most senior levels of management and practised every day.

Talk and communicate with people, because not everything will be officially documented or known about. Workarounds and short-term fixes often become routine in the workplace, so make people aware of the impact of failing to manage data effectively: it affects everyone.
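As an illustration of the retention and file-level analysis tips above, here is a minimal Python sketch of a file inventory. It is not a substitute for commercial File Level Analysis software: it simply records each file's location, type, size and age, hashes the content to spot exact duplicates, and flags files older than a retention period. The retention periods, the function names `inventory` and `duplicates`, and the use of last-modified time as a proxy for a record's age are all assumptions made for illustration; real retention periods depend on the regulations that apply to each record type.

```python
import hashlib
import os
import time
from collections import defaultdict
from pathlib import Path

# Hypothetical retention periods in years, keyed by file extension.
# These numbers are placeholders, not regulatory guidance.
RETENTION_YEARS = {".csv": 5, ".log": 1}
DEFAULT_RETENTION_YEARS = 7
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def inventory(root, now=None):
    """Return one record per file under root.

    Each record holds the file's location, type, size, age in years,
    a flag for whether it is past its (assumed) retention period,
    and a SHA-256 hash of its content.
    """
    now = time.time() if now is None else now
    records = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        stat = path.stat()
        ext = path.suffix.lower()
        # Last-modified time stands in for the record's age (an assumption).
        age_years = (now - stat.st_mtime) / SECONDS_PER_YEAR
        limit = RETENTION_YEARS.get(ext, DEFAULT_RETENTION_YEARS)
        records.append({
            "path": str(path),
            "type": ext or "(none)",
            "size": stat.st_size,
            "age_years": age_years,
            "past_retention": age_years > limit,
            # Reads the whole file into memory; fine for a sketch,
            # but large files would need chunked hashing.
            "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
        })
    return records


def duplicates(records):
    """Group file paths that share identical content."""
    groups = defaultdict(list)
    for record in records:
        groups[record["sha256"]].append(record["path"])
    return [paths for paths in groups.values() if len(paths) > 1]
```

Running `inventory` over a shared drive and passing the result to `duplicates` gives a first, rough view of what is being kept, where it lives, and how much of it is redundant or past its assumed retention period.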

Only after taking these steps can managers in the pharmaceutical industry begin to find, understand and derive benefit from data they previously looked on as nothing more than an expensive waste product – and reduce costs and risks by keeping less in the first place.

Shining a light on dark data is just the start – but it’s a very good start.