From disease to cure

How open data is changing healthcare

Pharmaceutical organisations are becoming increasingly reliant on the data that is available to them to develop new products and overcome the modern challenges of curing diseases.

There is a large amount of information in the world that can aid research and development (R&D). It has not always been openly available, largely because of Big Pharma’s desire to protect this information and maintain its competitive edge.

This is now starting to change: businesses and organisations, both public and private, are sharing an increasing amount of information with one another. This growing access to data allows organisations to apply techniques to areas of drug development where they were not possible before, key examples being recognising disease signatures and applying the precision medicine model.

As pharmaceutical companies evolve to try to make the most out of data, the challenges they face change as well. Although open information is beneficial to the industry, utilising it is difficult.

Development and comparison

Owing to the amount of data now available, it has become easier to build a broad picture of different diseases, their symptoms and the effects they have on society. This is particularly noticeable with regard to recognising disease signatures for better decision making and elucidating the pathophysiology of diseases.

Recognising disease signatures is a difficult process. The many subtle relationships between different diseases can be tough to spot and compare, and without access to detailed information about each disease, building an understanding of where the correlations lie can be challenging.

Obtaining this comparative information is becoming a simpler process. Whereas comprehensive data sets about real-world outcomes were once hard to come by, these are now becoming increasingly open and available. This information stems from a range of sources, such as healthcare providers opening up their data for different organisations to access and the text-mining of unstructured data sources such as scientific literature and patents.

Access to all this data means the challenge is no longer obtaining information, but rather understanding how to collate, manage and analyse it. Long-established statistical techniques can still extract knowledge from this data but, because of increased computing power, they are now being used alongside newer methods such as machine learning and deep learning.

These methods allow pharmaceutical companies to identify relationships in this multidimensional space with an accuracy and relevance that would not have been possible a few years ago. Machine learning gives life science organisations the ability to break down this vast data resource and identify the correlations that indicate potential disease signatures, building a better understanding of where the similarities between diseases lie.
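To make the idea of spotting correlations between diseases concrete, here is a minimal sketch. It uses a classical Pearson correlation as a simple stand-in for the richer machine learning models described above, and it is not drawn from any particular company's pipeline. The disease names, the four features and every value are invented purely for illustration.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy profiles: each disease is scored on the same four
# made-up features (for example, normalised biomarker levels).
profiles = {
    "disease_a": [0.9, 0.1, 0.8, 0.2],
    "disease_b": [0.8, 0.2, 0.9, 0.1],
    "disease_c": [0.1, 0.9, 0.2, 0.8],
}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def rank_pairs(profiles):
    """Return (correlation, disease, disease) tuples, most similar first."""
    pairs = [
        (pearson(profiles[a], profiles[b]), a, b)
        for a, b in combinations(sorted(profiles), 2)
    ]
    return sorted(pairs, reverse=True)


ranked = rank_pairs(profiles)
for score, a, b in ranked:
    print(f"{a} vs {b}: {score:+.2f}")
```

In practice the profiles would have thousands of dimensions and the comparison would be done by a trained model rather than a single statistic, but the principle is the same: pairs that score highly become candidates for a shared disease signature.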

The valuable insights now being uncovered by these teams are driving modern, data-driven initiatives within pharma, biotech and CRO companies alike. This is changing the way the industry operates, from early drug discovery, preclinical work and manufacturing through to medical practice. Smaller biotech firms have been at the forefront of this change; with their greater agility compared with Big Pharma, they have free rein to build out and develop these methods.

Larger organisations are realising this and either setting up their own analytics programmes or expanding through the acquisition of smaller, more innovative biotechnology companies. As the value of analytics grows further still, this trend will continue in the short term.

Real-world data for real-world solutions

Data sharing models are not only affecting R&D focused on disease signatures. Precision medicine, the tailoring of specific care plans and drug regimens to patients with a specific genetic profile, has traditionally been an area that pharmaceutical companies have been reluctant to invest in. This is partly because of data limitations: there has not been enough information available on specific patient groups to create effective medicines and personalised treatment plans quickly.

Additionally, the data required for precision medicine is distributed, heterogeneous and unstructured, and often needs further cleansing to be truly useful. Many organisations lack the resources and analytical capability to interpret this information. To truly harness precision medicine, there’s a need to blend multiple public data sources with real-world evidence to build a detailed picture of a specific genetic group. If there is not enough information about issues affecting certain patient groups in specific areas, then it’s not possible to develop effective solutions.

This requires a combination of medical health claims, adverse effect reports, patient drug responses and efficacy studies to be brought together from disparate places and analysed. Using this information, it’s possible to determine which subset of a population is affected, identify the subpopulation suited to a particular drug and then describe the care plans.

Developments in big data capabilities are making different datasets far more compatible. Medical records and real-world evidence can now be combined into a unified, understandable format, rather than each source having to be examined separately. This is the result of data fusion techniques, which bring information together so that statistical models can be built and tested against it. It allows pharmaceutical companies to generate new insights and medical information that practitioners can act upon, a process that would once have taken a team of experts years.
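As a rough illustration of what combining sources into a unified format can look like, the sketch below merges three small record sets that use different field names for the same patient, then computes a response rate per genetic variant. It is a toy example under stated assumptions: the record layouts, field names, patient IDs, drug name and variants are all hypothetical, and real data fusion also involves cleansing, de-identification and entity matching that are not shown here.

```python
from collections import defaultdict

# Hypothetical inputs: three sources, each keyed differently.
claims = [
    {"patient_id": "p1", "drug": "drug_x", "responded": True},
    {"patient_id": "p2", "drug": "drug_x", "responded": True},
    {"patient_id": "p3", "drug": "drug_x", "responded": False},
    {"patient_id": "p4", "drug": "drug_x", "responded": True},
]
genotypes = [
    {"patient": "p1", "variant": "v1"},
    {"patient": "p2", "variant": "v1"},
    {"patient": "p3", "variant": "v2"},
    {"patient": "p4", "variant": "v2"},
]
adverse_events = [{"pid": "p2", "event": "rash"}]


def fuse(claims, genotypes, adverse_events):
    """Merge the three sources into one record per patient."""
    unified = {}
    for row in claims:
        unified[row["patient_id"]] = {
            "drug": row["drug"],
            "responded": row["responded"],
            "variant": None,
            "events": [],
        }
    for row in genotypes:
        if row["patient"] in unified:
            unified[row["patient"]]["variant"] = row["variant"]
    for row in adverse_events:
        if row["pid"] in unified:
            unified[row["pid"]]["events"].append(row["event"])
    return unified


def response_rate_by_variant(unified):
    """Share of patients who responded, grouped by genetic variant."""
    totals = defaultdict(lambda: [0, 0])  # variant -> [responders, patients]
    for record in unified.values():
        if record["variant"] is None:
            continue  # skip patients with no genotype on file
        totals[record["variant"]][0] += int(record["responded"])
        totals[record["variant"]][1] += 1
    return {v: hits / count for v, (hits, count) in totals.items()}


unified = fuse(claims, genotypes, adverse_events)
rates = response_rate_by_variant(unified)
print(rates)
```

Once the records share one shape, questions that span sources, such as whether one genetic group responds better to a drug or reports more adverse effects, become simple queries rather than separate investigations.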

The solutions are there, but it requires more than technology alone

Access to new data is changing the way that the pharmaceutical industry works, and it’s clear that research and development is reaping the rewards of this. Yet, it is important to remember that benefiting from the increasing pool of open information requires more than access to data. Data on its own has no value unless the key information needed can be extracted from it. Although systems such as machine learning and AI can help with this process, they are not ‘black box’ solutions that can work alone.

Generating the full value of machine learning requires teams with a broad range of skills: people who understand the business and the scientific domain, who know where the new value is to be found and who know how to use the tools to extract valuable insights from the available data. That requires extensive expertise and people who can work together to find a common language with which to communicate effectively with partners both inside and outside the enterprise.

Data analytics techniques are needed to understand and identify causal relationships between the disease, patient attributes, environmental impact, race, upbringing and drug treatments, to name a few of the variables. Many pharmaceutical organisations will not have these capabilities and staff in-house, but this does not mean that they are excluded from utilising their data. By partnering with data science experts and analysts who can identify, understand, extract and effectively communicate the business value of this data, any organisation can make the most of the information it holds and be at the cutting edge of innovation in the industry.