There is significant anticipation that Artificial Intelligence, or “AI”, will revolutionize the paradigm and economics of drug discovery. The diseases that await next-generation drug discovery span the spectrum from cancers to Parkinson’s, Alzheimer’s and others. Drug discovery at its current pace is neither scalable nor economically sustainable: developing a single new drug requires $2.7 billion and 14 years on average (study by the US FDA and Tufts University). Fewer than 10 percent of potential drugs researched actually make it to market. The pharmaceutical industry needs a much more optimized, scalable, and cost-effective way of bringing new drugs to market, and AI is the technology that holds tremendous promise for providing it.
Taking a simplified view, drug discovery involves working with massive amounts of very different kinds of data, and many facets of AI can help us work with and comprehend this “big data”. But where, more specifically, does AI apply, and what promise does it actually hold for specific problems? We take a look at particular kinds of information processing and comprehension problems in drug discovery.
Machine Reading of Scientific Results
There are at least ten thousand new biomedical research papers added to the biomedical research literature pool every day! It is cognitively impossible for researchers, or even large research groups, to comprehend this information effectively through manual discovery, exploration and assimilation alone. We are now seeing the emergence of AI-driven machine reading software that sifts through millions of research articles and creates comprehensible, aggregated views for researchers. These views include not only facts extracted and assimilated from research papers but also more sophisticated synthesized constructs, such as connections and hypotheses drawn from multiple research papers and patents. AI technologies including deep natural language processing and knowledge extraction are being applied to make this happen.
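As a toy illustration of the knowledge extraction idea described above, the sketch below counts how often a drug term and a disease term appear in the same sentence across a few abstracts. The term lists, the sample sentences, and the function name `cooccurrences` are hypothetical and chosen only for illustration; real machine reading systems rely on trained language models and curated ontologies rather than plain string matching.

```python
import re
from collections import Counter
from itertools import product

# Hypothetical vocabularies; a real system would use curated ontologies.
DRUGS = {"metformin", "aspirin"}
DISEASES = {"diabetes", "cancer"}


def cooccurrences(abstracts):
    """Count (drug, disease) pairs mentioned within the same sentence."""
    counts = Counter()
    for text in abstracts:
        # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            words = set(re.findall(r"[a-z]+", sentence.lower()))
            drugs = words & DRUGS
            diseases = words & DISEASES
            for pair in product(sorted(drugs), sorted(diseases)):
                counts[pair] += 1
    return counts


# Made-up example sentences, not real findings.
abstracts = [
    "Metformin is widely used for diabetes. Aspirin was not studied.",
    "We examine metformin in cancer and diabetes cohorts.",
]

for (drug, disease), n in cooccurrences(abstracts).most_common():
    print(f"{drug} <-> {disease}: {n} sentence(s)")
```

Aggregated over millions of articles, pair counts of this kind are one simple way to surface candidate connections for a researcher to review; they indicate co-mention only, not a causal or therapeutic relationship.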
Automated machine reading of biomedical research literature is one of the most promising near-term AI applications that can speed up drug discovery. Note also that only a small fraction (fewer than 10%) of candidate drugs researched actually make it to market. What we do have, however, is data and results on the candidate drugs that did not make it. Learning from the data on failed drugs is another area where AI-driven machine reading can help.
In the current drug identification process, pharmaceutical companies have to screen very large numbers of candidate (drug) molecules and determine potential winners by exhaustively testing each one of them. A fundamentally different approach is to “imagine” new molecules, aided by deep learning. New molecules are not generated randomly, of course, but with properties known to be effective against diseases and without adverse effects. This is what is referred to as “in silico” drug generation, where candidate drugs are generated in an informed manner.
University research labs are also experimenting with what is being termed “inverse drug design”, which uses deep learning to find candidates for drug development. New molecular structures are generated by software that combines properties of existing drugs. Generative machine learning models are also being employed to “craft” new molecules with cancer-like properties. Potential candidates are then evaluated against such generated molecules for effectiveness. Machine learning predictive models are also being used to predict drug therapeutics. Again, as opposed to actual chemical experimentation and testing, AI holds the potential to identify the therapeutic use categories of potential drugs.
A practical challenge that must be overcome for drug discovery and biomedical advancement in general is that of data sharing. No single organization holds (or ever will hold) all the various kinds of data that must be leveraged for drug discovery. Drug discovery involves information contained in patient medical records, research studies, clinical trials, medical images and also genomic (sequenced) data.
AI tools are required to syntactically and semantically harmonize information from such multiple repositories of data.
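To make the distinction between syntactic and semantic harmonization concrete, here is a minimal sketch, assuming two hypothetical sources (an "ehr" export and a "trial" export) with different field names, units, and diagnosis vocabularies. All field names, the synonym table, and the `harmonize` function are illustrative assumptions, not part of any real system or standard.

```python
# Syntactic step: map each source's field names onto one shared schema.
# (Hypothetical schemas, for illustration only.)
FIELD_MAP = {
    "ehr": {"pt_id": "patient_id", "dx": "diagnosis", "wt_lb": "weight_kg"},
    "trial": {"subject": "patient_id", "condition": "diagnosis",
              "weight_kg": "weight_kg"},
}

# Semantic step: map different spellings of a concept onto one canonical term.
# (Hypothetical synonym table; real systems would use a medical ontology.)
CANONICAL_DIAGNOSIS = {
    "t2dm": "type 2 diabetes",
    "type ii diabetes": "type 2 diabetes",
    "type 2 diabetes": "type 2 diabetes",
}

LB_TO_KG = 0.45359237  # exact definition of the international pound


def harmonize(record, source):
    """Return a copy of `record` expressed in the shared schema."""
    out = {}
    for field, value in record.items():
        target = FIELD_MAP[source].get(field)
        if target is None:
            continue  # drop fields that have no mapping in the shared schema
        if field == "wt_lb":
            value = round(value * LB_TO_KG, 1)  # unit conversion: lb -> kg
        if target == "diagnosis":
            key = value.strip().lower()
            value = CANONICAL_DIAGNOSIS.get(key, key)  # fall back to cleaned text
        out[target] = value
    return out


ehr_row = {"pt_id": "A1", "dx": "T2DM", "wt_lb": 176.0}
trial_row = {"subject": "S9", "condition": "Type II Diabetes", "weight_kg": 80.0}

print(harmonize(ehr_row, "ehr"))
print(harmonize(trial_row, "trial"))
```

After harmonization both records use the same field names, the same weight unit, and the same diagnosis label, so they can be pooled for analysis. In practice the mapping tables themselves are the hard part, which is where learned models can assist.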
Medical Image Analysis
Key information that can bear upon drug discovery is locked in medical images. Digital pathology today gives us very powerful molecular imaging capabilities for research. In diseases such as cancer, there is key information within the MRI scans (images) taken of patients and research subjects. Traditional image processing techniques are proving to be limited in the analysis of such medical images. Current work points to deep learning based image analysis as a promising and scalable approach for automated analysis of biomedical images.
TeraCrunch Drug Discovery AI Solutions
Automated machine reading for comprehending the vast and evolving body of biomedical research literature will significantly optimize and speed up current research towards drug discovery. TeraCrunch’s solutions in this area leverage its Socratez™ text mining platform, a state-of-the-art text analytics engine that can be customized for particular domains. Further, TeraCrunch data scientists have significant expertise and experience in custom machine reading solution development, especially for clients in the biomedical research domain.