Artificial Intelligence

Using machine learning to better predict clinical trial outcomes

Leveraging AI to forecast clinical trial outcomes can help biomedical stakeholders gain better insight into the drug and device approval process.

Kara Baskin

Jul 2, 2019

Randomized clinical trials for new drugs and devices have always been a high-risk venture for a variety of stakeholders — investors, biopharma leaders, regulators, and, of course, patients and their families.

Now, MIT researchers are employing machine learning and statistical techniques to enhance data on clinical trial outcomes, allowing them to better handicap the drug and device approval process.

A new study, published in the debut issue of the Harvard Data Science Review, aims to provide more timely and accurate estimates of the risks of clinical trials. That data can help stakeholders manage their resources more efficiently, leading to fewer failures, faster drug approval times, a lower cost of capital, and more funding for developing new therapies.

“Everyone is affected by the risk of a drug failing in its clinical trial process,” said Andrew Lo, the study’s senior author and director of the MIT Laboratory for Financial Engineering. “With more accurate measures of the risk of drug and device development, we hope to encourage greater investment at this unique inflection point in biomedicine,” said Lo, who also serves as principal investigator at the MIT Computer Science and Artificial Intelligence Laboratory.

The research is part of Project ALPHA (Analytics for Life-sciences Professionals and Healthcare Advocates), a collaboration between the MIT Laboratory for Financial Engineering (LFE) and Informa Pharma Intelligence, a provider of pharmaceutical and medtech data. The software used in the study will be made publicly available with an open-source license.

The study leverages the largest set of data to date to analyze the success or failure of clinical trials and uses more than 140 features — including trial outcome, trial status, trial accrual rates, duration, prior approval for another indication, and sponsor track record — to forecast clinical trial outcomes.

Lo and his research team, PhD candidates Kien Wei Siah and Chi Heem Wong, combined machine-learning techniques with statistical methods to account for missing data. A carefully designed statistical imputation technique makes it possible to estimate missing values along with other model parameters, such as the probability of success. The result: more accurate forecasting.
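To make the idea of filling in missing data before forecasting concrete, here is a minimal Python sketch. It is not the study's method: it swaps in plain column-mean imputation and a small logistic regression fitted by gradient descent, whereas the researchers describe a more carefully designed imputation technique. The two features (a scaled accrual rate and a prior-approval flag), the toy trial records, and all function names are hypothetical and chosen only for illustration.

```python
import math

def impute_mean(rows):
    """Replace None entries in each column with that column's observed mean."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression weights (bias first) by batch gradient descent."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def predict_success(w, x):
    """Estimated probability of success for one trial's feature vector."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

# Hypothetical toy data: [scaled accrual rate, prior approval flag]; None = missing.
trials = [
    [0.9, 1], [0.8, None], [0.7, 1], [None, 1],
    [0.2, 0], [0.3, None], [0.1, 0], [None, 0],
]
outcomes = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = trial succeeded, 0 = trial failed

completed = impute_mean(trials)
weights = fit_logistic(completed, outcomes)
strong = predict_success(weights, [0.9, 1])
weak = predict_success(weights, [0.1, 0])
print(f"strong trial: {strong:.2f}, weak trial: {weak:.2f}")
```

Mean imputation is the simplest possible stand-in. Estimating missing values jointly with the model's other parameters, as the article describes, would generally need a more involved procedure than this two-step sketch.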

“It’s the difference between looking back at historical wins and losses to predict the outcome of a horse race versus handicapping the likely winner based on multiple factors like the horse’s pedigree, track record, temperament, the training regimen, the condition of the track, the jockey’s skill, and so on,” Lo said.

Additionally, the authors’ earlier study of historical probabilities of success for clinical trials, computed without any additional information (“unconditional” estimates), is updated quarterly and is also available on the ALPHA website.
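The difference between an “unconditional” estimate and one that takes trial features into account can be shown with a short Python sketch. The records below are invented for illustration and are not drawn from the study or from Project ALPHA; the field names `prior_approval` and `approved` are likewise hypothetical.

```python
def success_rate(records):
    """Fraction of trials in `records` that ended in approval."""
    return sum(r["approved"] for r in records) / len(records)

# Hypothetical historical records, not real trial data.
history = [
    {"prior_approval": True, "approved": True},
    {"prior_approval": True, "approved": True},
    {"prior_approval": True, "approved": True},
    {"prior_approval": True, "approved": False},
    {"prior_approval": False, "approved": True},
    {"prior_approval": False, "approved": False},
    {"prior_approval": False, "approved": False},
    {"prior_approval": False, "approved": False},
]

# Unconditional: the historical success rate, ignoring every feature.
unconditional = success_rate(history)

# Conditional: the success rate among trials sharing one known feature.
with_prior = success_rate([r for r in history if r["prior_approval"]])
without_prior = success_rate([r for r in history if not r["prior_approval"]])

print(unconditional, with_prior, without_prior)
```

In this toy data the overall rate hides a large gap between the two groups, which is the kind of information a feature-based forecast is meant to exploit.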