In the last 4–6 years, companies have been experimenting with new approaches that utilise the latest in artificial intelligence (AI) and machine learning technologies to identify new molecules and validate new drugs faster and more efficiently. Egor Kobelev, VP of Healthcare and Life Sciences at DataArt, examines some of the ways these breakthroughs are transforming drug discovery
Companies such as Schrodinger have been using computational methods to discover novel targets and drugs for almost 30 years. Schrodinger’s approach focuses mainly on proteins: using a database of protein structures, the company develops compounds based on biophysics, machine learning and molecular docking simulations of binding properties.
For a while, this was the only model that worked. Schrodinger has built a successful business during the last three decades, offering software and services meant to help discover small molecules and optimise them for potential targets. However, times have changed. Schrodinger has had to adjust its business model to include the development of therapeutics, along with its more traditional molecule discovery services.
This move signals a broader industry shift during the last 4–6 years. The community has been wondering how far we can push the drug design process, how much faster and more efficiently we can identify new molecules and validate new drugs. Below are just a few of the ways in which breakthroughs in artificial intelligence (AI) and machine learning are revolutionising drug discovery.
With AI’s growing influence in the tech world, industry experts and researchers have been trying to use the burgeoning technology to solve practically any issue they come across. This approach makes sense for some industries more than others, especially those with large amounts of data. The drug industry fits that bill in spades.
Some companies, such as Atomwise and Exscientia, are using AI to revolutionise the tried-and-true approaches. Essentially, they are working to discover small molecules that can lead to new drugs, based on pattern recognition. Although their concepts are similar, the mathematics driving their discoveries is very different … and both lean heavily on machine learning.
Younger companies such as Cyclica, Cloud Pharmaceuticals and E-Therapeutics have waded into the field more recently, and new start-ups are popping up every year. The market for this fresh take on older concepts is growing, because these new methods are finally producing practical results!
Researchers at Johns Hopkins University and Insilico Medicine recently published a paper in Nature Biotechnology in which they claimed to have identified a new compound for a particular target within 3 weeks using neural networks. Although the merit of these outcomes was debated, the idea of drastically cutting down the 5–8-year drug discovery process is incredibly attractive.
During the last few decades, drug companies and academics have been collecting data. Throughout their research, they have accumulated more information on molecules and compounds than any team of humans could ever sift through with a reasonable level of efficiency. And as measuring capabilities increase, data collection is only getting more sophisticated and complex. For AI researchers, that is a potential goldmine.
With the newer deep learning AI techniques developed during the past few years, analysing this information could take a few days, rather than years. Potentially, companies could look at publicly available databases of compound binding, such as drug databases, and then, with an understanding of the patterns of safety and efficacy for millions of such compounds, create their own predictive algorithms using neural networks or other machine learning tools.
The beauty of this approach is that it’s not terribly complicated. With access to large databases that have been built through academic research, and combining those with machine learning approaches to match patterns with the desired outcome, a company could ultimately deliver a new and interesting molecular lead. As long as the right database exists to train the machine learning models, putting this method into action could be both simple and fruitful.
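To make this pattern-matching idea concrete, here is a minimal sketch of the kind of model such a company might train. Everything in it is invented for illustration: the 128-bit "fingerprints", the binding labels and the planted substructure rule are synthetic stand-ins for a real compound-binding database, not data from any actual source.

```python
# Toy sketch: learn a binds / does-not-bind pattern from compound
# "fingerprints". All data is synthetic -- a stand-in for a real database.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# 2,000 hypothetical compounds, each a 128-bit structural fingerprint.
X = rng.integers(0, 2, size=(2000, 128))

# Invented ground truth: a compound "binds" when at least two of a few
# specific substructure bits are set -- the pattern the model must find.
key_bits = [3, 17, 42, 99]
y = (X[:, key_bits].sum(axis=1) >= 2).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# A random forest is one simple pattern-matching tool; a neural network
# could be swapped in without changing the overall workflow.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"held-out accuracy: {accuracy:.2f}")
```

In practice, the synthetic arrays would be replaced with real structural fingerprints and measured binding or safety outcomes; the workflow itself, train on known compounds, then rank untested ones by predicted activity, is the simple core of the approach described above.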
Furthermore, having that database of structures linked to the typical outcome metrics, or measures of how these compounds perform in relation to a specific disease such as cancer, could result in more targeted drug design and discovery. In this technique’s infancy, the scope may be restricted to one disease, which, owing to the sheer volume of research, would likely be cancer. But it doesn’t have to stop there.
Only around 1500 drugs have been approved by the US FDA, but hundreds of thousands, if not millions, of compounds have been tested in clinical trials and preclinical experiments. This abundance of data, which may previously have been seen as not very useful, is now more important than ever. So far, it has yet to be repurposed; but, with the power of AI, it could contain the key to the next generation of pharmaceuticals.
Generally, much of the pushback against using machine learning to discover new drugs or novel compounds stems from a romanticised idea of human expertise: that no collection of code could match the insight that comes from decades of medical research experience. The point is well made, and it holds water, up to a point.
Yes, AI systems have yet to match the sophistication of a team of human experts. These systems are fallible: they can make mistakes or miss things. But so are humans, who are inherently imperfect and will inevitably err. At the end of the day, a compound is either useful or not … and whether it was discovered by a team of human experts or a sophisticated computational model is immaterial. So, the question to be answered is this: which method is more efficient?
Although both humans and machines make mistakes, the former tend to make random mistakes while the latter make systematic ones. The beauty of systematic mistakes is that they can be fixed, so the system becomes more efficient with time. Measures can be put in place to reduce human error, but that’s not the same as changing a line of code or honing an algorithm. Ultimately, what matters is the method that delivers the most usable compounds.
In this new AI-driven future, the ideal drug discovery and design team can be relatively small. Some teams operating today consist of just a handful of people, but they are extremely focused and specialised. These small outfits could work from a large data set, and they would not necessarily need a software development team.
Instead, they would need two main components: experts in building and training statistical models through machine learning, regression or any other method, and experts in structural biology and biochemistry. The former may seem obvious, but the latter, though equally, if not more, important, is often overlooked or undervalued.
Ultimately, the biological experts will help connect the discoveries made by the algorithms with actionable results, blending the pros of both machine and human expertise, and demonstrating that the discovery methodology is uncovering meaningful outcomes.
If the integration of AI into the drug discovery and design process works, it can have incredible disruptive effects, both positive and negative. On the one hand, it could disrupt entire segments of the industry, leaving many chemoinformatics employees out of jobs.
On the other hand, cutting discovery from years to months would have an almost incalculable effect on the wider pharma environment. An industry that justifies the high price of its products by the lengthy and costly research and development phase would be empowered to make more drugs quickly and affordably. Even with the possibility of negative disruption, that seems like a gamble worth taking.