How AI Is Opening Up the Use of Real-World Data for Clinical Research
Key Highlights
- TRIALSCOPE uses AI and real-world data to both simulate and validate clinical trial outcomes.
- The framework helps identify eligible patients from electronic health records, speeding up enrollment.
- The platform is already in use at Providence, supporting personalized therapies and advancing precision medicine initiatives.
Artificial intelligence is opening up new avenues for researchers to use real-world electronic health record (EHR) data in clinical research. Healthcare Innovation recently interviewed Hoifung Poon, general manager of Real-World Evidence at Microsoft Research, and Carlo Bifulco, Ph.D., medical director of Cancer Genomics and Precision Oncology at the Providence Cancer Institute, about their work to overcome the challenges of traditional clinical trials, including low enrollment, high costs and high failure rates.
Following a three-year study that assessed de-identified data from cancer patients across Providence, researchers from Providence and Microsoft developed TRIALSCOPE, an AI-powered framework designed to both simulate and validate clinical trial outcomes using real-world data. The framework enables researchers to reproduce the results of large, historical clinical trials from observational patient data.
In a paper published in NEJM AI, the researchers explain that TRIALSCOPE was shown to “automatically curate high-quality structured patient data, expanding the dataset and incorporating key patient attributes only available in unstructured form. The framework reduces confounding in treatment effect estimation, generating comparable results with randomized controlled lung cancer trials. In addition, we demonstrate simulations of unconducted clinical trials — including a pancreatic cancer trial with varying eligibility criteria — using a suite of validation tests to ensure robustness.”
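To make "reducing confounding" concrete, here is a minimal Python sketch. It does not show TRIALSCOPE's own method, which this article does not detail. It assumes a single confounder (disease stage), a small synthetic cohort, and simple stratification as the adjustment technique; every name and number in it is invented for illustration.

```python
from collections import defaultdict


def make_cohort():
    """Build a small synthetic cohort (invented numbers, not real patient data).

    Early-stage patients are treated more often AND respond more often,
    so disease stage confounds a naive treated-vs-control comparison.
    """
    rows = []

    def add(stage, treated, n, responders):
        for i in range(n):
            rows.append({"stage": stage, "treated": treated,
                         "response": 1 if i < responders else 0})

    add("early", True, 8, 6)   # 75% respond
    add("early", False, 2, 1)  # 50% respond
    add("late", True, 2, 1)    # 50% respond
    add("late", False, 8, 2)   # 25% respond
    return rows


def response_rate(rows):
    return sum(r["response"] for r in rows) / len(rows)


def naive_effect(rows):
    """Difference in response rates, ignoring disease stage."""
    treated = [r for r in rows if r["treated"]]
    control = [r for r in rows if not r["treated"]]
    return response_rate(treated) - response_rate(control)


def stratified_effect(rows, key="stage"):
    """Average the within-stratum differences, weighted by stratum size."""
    strata = defaultdict(list)
    for r in rows:
        strata[r[key]].append(r)
    effect = 0.0
    for group in strata.values():
        treated = [r for r in group if r["treated"]]
        control = [r for r in group if not r["treated"]]
        if not treated or not control:
            raise ValueError("each stratum needs treated and control patients")
        weight = len(group) / len(rows)
        effect += weight * (response_rate(treated) - response_rate(control))
    return effect


if __name__ == "__main__":
    cohort = make_cohort()
    print(f"naive effect:      {naive_effect(cohort):.2f}")
    print(f"stratified effect: {stratified_effect(cohort):.2f}")
```

In this toy cohort the naive comparison overstates the treatment benefit because healthier, early-stage patients were more likely to be treated. Comparing like with like inside each stage gives a smaller estimate. Real observational analyses must handle many confounders at once, which is where the careful validation described in the paper comes in.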
In a Providence news item, Brian Piening, Ph.D., director of research for Providence Genomics and co-author of the study, explains that this approach “de-risks clinical trials by using real-world data from patients who have already received treatments, allowing researchers to generate insights without exposing new patients to new medication. While the smaller, simulated datasets still require careful validation, TRIALSCOPE’s potential is invaluable, giving researchers a powerful new framework to help reduce the need for large initial participant pools and accelerating the path to more effective studies.”
One goal is to enhance trial efficiency and generalizability using advanced AI techniques. The researchers noted that this approach doesn’t replace validation but offers a way to reduce early risk and optimize trial planning before enrolling patients.
One potential application for TRIALSCOPE is to identify promising new treatment strategies by mining compassionate use data, where individual patients gain access to experimental therapies when other options have failed.
In our interview, Bifulco began by explaining some of the challenges around traditional clinical trials. “Much of the progress that we make in medicine is clinical trial-mediated, so they are essential tools. But at the same time, I think only 4% or 5% of patients are offered clinical trials or enrolled in clinical trials, and there are major socio-economic and racial discrepancies in who gets enrolled in trials,” he said. “There’s another layer of problems, which has to do with the cost of the trials. They're expensive to run and they often take way too long because not enough patients are enrolled. Also, they have very high failure rates. So anything that helps us improve on all those dynamics is a step in the right direction.”
The researchers say that TRIALSCOPE has the potential to shorten enrollment by identifying eligible patients from data in their electronic medical records, overcoming the limitations of manual curation.
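Once patient attributes have been structured, checking them against a trial's eligibility criteria can be sketched in a few lines of Python. This is a hypothetical illustration, not TRIALSCOPE's implementation. The record fields, the criteria, and the sample patients are all assumptions made up for this example, and the hard part the researchers describe (extracting these attributes from unstructured notes) is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatientRecord:
    """Hypothetical structured record; field names are illustrative only."""
    patient_id: str
    age: int
    diagnosis: str
    stage: int
    ecog: int                 # performance status, lower is better
    biomarkers: frozenset


@dataclass(frozen=True)
class Criteria:
    """Hypothetical eligibility criteria for a single trial."""
    diagnosis: str
    min_age: int
    max_ecog: int
    allowed_stages: frozenset
    required_biomarkers: frozenset


def is_eligible(patient, criteria):
    """Return True only if the patient meets every criterion."""
    return (
        patient.diagnosis == criteria.diagnosis
        and patient.age >= criteria.min_age
        and patient.ecog <= criteria.max_ecog
        and patient.stage in criteria.allowed_stages
        and criteria.required_biomarkers <= patient.biomarkers  # subset check
    )


def screen(patients, criteria):
    """Return the IDs of eligible patients, preserving input order."""
    return [p.patient_id for p in patients if is_eligible(p, criteria)]


# Synthetic examples, not real patient data.
SAMPLE_PATIENTS = [
    PatientRecord("P001", 64, "NSCLC", 4, 1, frozenset({"EGFR"})),
    PatientRecord("P002", 71, "NSCLC", 2, 0, frozenset({"EGFR"})),
    PatientRecord("P003", 58, "pancreatic", 4, 1, frozenset({"KRAS"})),
    PatientRecord("P004", 49, "NSCLC", 3, 2, frozenset({"EGFR", "TP53"})),
    PatientRecord("P005", 17, "NSCLC", 4, 0, frozenset({"EGFR"})),
]

EXAMPLE_CRITERIA = Criteria(
    diagnosis="NSCLC",
    min_age=18,
    max_ecog=2,
    allowed_stages=frozenset({3, 4}),
    required_biomarkers=frozenset({"EGFR"}),
)

if __name__ == "__main__":
    print(screen(SAMPLE_PATIENTS, EXAMPLE_CRITERIA))
```

Keeping the criteria in a separate object makes it easy to loosen or tighten a single threshold and rerun the screen, which is similar in spirit to the simulations with varying eligibility criteria that the paper mentions.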
The platform is already being used regularly by Providence researchers. For instance, Bifulco described how a Providence researcher is developing therapies that act like personalized T cells to help recognize mutations. Only a small number of patients will be able to enroll in the study, because they must meet very specific criteria. “We were able to identify patients across Providence from different regions to enroll in this study through the platform,” he said. “I would say that the feedback from oncologists also is crucial, because there are real-world, logistical components that go into this beyond just the trial matching, so we really value their feedback.”
With an eye on advancing precision medicine, Microsoft’s Poon said one goal is to develop a virtual patient that serves essentially as a digital twin, drawing on a patient's multimodal longitudinal history to forecast how a disease like cancer might progress.
Traditional clinical trials lack the real-world patient distribution, he added, even though they are fundamentally trying to derive information that generalizes to the broader patient population. “I would say that the vision of this virtual patient is that we could actually incorporate all kinds of information about the patient, including multimodal information like imaging, all kinds of multi-omics and so forth, which currently has been underutilized in designing a patient trial and patient stratification,” Poon explained. “So far, there has been very little modeling of the fine-grained detail of the trajectory or the co-morbidities. The question is can we actually harness AI's capability to handle all that information, because traditionally that information is extremely unstructured, with lots of noise and biases. But that's exactly the sweet spot of Gen AI, so by constructing such a virtual patient, what we hope is to maximize our information gain from the real-world data, and then use that to start marching towards more and more of the virtual trial.”
About the Author

David Raths
David Raths is a Contributing Senior Editor for Healthcare Innovation, focusing on clinical informatics, learning health systems and value-based care transformation. He has been interviewing health system CIOs and CMIOs since 2006.
