Over the past several years, Flatiron Health has built up an oncology-focused EHR network and a de-identified database from more than 280 U.S. academic and community cancer clinics in order to enable large-scale real-world research. Blythe Adamson, Ph.D., M.P.H., senior principal scientist at Flatiron, recently spoke with Healthcare Innovation about the company’s laser focus on oncology data curation and the huge difference machine learning models have made to the company’s curation and analytical capabilities.
According to her bio, Adamson was formerly the lead data scientist in the West Wing of the White House. She founded Infectious Economics in 2017 to provide thought leadership to policy makers and industry leaders on cost-effective strategies to prevent the transmission of viruses. At New York-based Flatiron, which is an independent affiliate of the Roche Group, her team pioneered deep learning language models for extraction of clinical details from EHR documents.
HCI: In 2016, I interviewed Flatiron co-founder Nat Turner about the company’s origins and he told me their initial goal was to create a big-data analytics offering for cancer centers, but then they also realized they had to own the EHR itself, and that became Flatiron’s OncoEMR. Could you talk about your role with the company and the increasing importance of AI and the deep learning language models for the extraction of that data from the EHR?
Adamson: I've been at Flatiron for five years, and it really has changed over time. The capabilities of what's possible to do now with natural language processing, machine learning, and AI didn't exist five years ago when I joined the company or in 2016 when you first interviewed Nat. What attracted me to Flatiron at that time was that they had solved something that no one in the world had solved before, which was how to curate all of the unstructured data, the clinical notes, the radiology scans. The result of that was this technology-enabled abstraction: opening up these charts in a really standardized way following policies and procedures, and documenting all of those elements of clinical depth that are necessary to be able to get insights that you want about a population or the effectiveness of a treatment. It really hasn't been until the last few years that advances in machine learning have made such huge leaps forward that it's possible to really start taking advantage of that to increase the speed and the scale of curation.
HCI: A lot of people in healthcare have seen this big increase in interest in large language models just in the past six to nine months, but is that something that you guys have been working on for much longer?
Adamson: Absolutely. There are so many different types of language models. I put out a paper with my team a few months ago describing at a high level the development of more than a dozen deep learning models to understand and read language and extract variables with similar accuracy to what our human clinical experts are doing. The first use case in the curation of EHR data was in cohort selection. We are helping our expert abstractors be more efficient in identifying patients, for example, who might have metastatic breast cancer. But now more of the use cases are actually in training these models to read and identify critical sentences and interpret the meaning of sentences in ways that are really similar to these clinical experts following sets of policies and procedures. An example of what that opens up: the biomarker status for an individual that may be trapped inside a complicated 10-page PDF of a genomic testing report. You might have five different vendors doing genomic sequencing, and all of their reports look different. They're all complicated and difficult to interpret. But one of the advantages that Flatiron has is a decade of golden data labelled by experts to train these models. That's a really big differentiator because there are a lot of brilliant machine learning engineers all over the world, and a lot of them can design sophisticated architectures. But if you don't have the data to train the model, you're not going to be able to get that high-quality performance. For the use cases that we work with at Flatiron, which can range from commercial insights and academic-level insights to regulatory-grade data, we have to keep a very close eye on our approaches for validation and for monitoring bias creeping in over time.
HCI: Is there any effort to move some of the things in the unstructured fields into structured fields in the EHR?
Adamson: That's such a good question. When I first joined Flatiron, I thought, Oh, perfect. I'll just tell the software engineers on the EMR side, ‘can you just make sure that you force everyone to write ECOG performance status at every visit?’ But we serve providers, and making sure that they have an experience that's optimized for providing the highest quality care and spending as much face-to-face time with patients is the top priority. We don't manipulate the experience and force providers to document things because it would be good for research.
HCI: Does the research work you do require everyone to be on Flatiron’s OncoEMR or do you work with oncology practices that might be on some other EHR such as Epic or Cerner?
Adamson: We work with both. To be able to answer questions like comparing the effectiveness of two different treatments, we have to have a representative dataset from the U.S., so our datasets reflect 80 percent community oncology care and 20 percent academic hospital systems. Those academic systems may be using Epic or Cerner. We have to do continuous data integration with those hospital systems. And that 80/20 split is on purpose. It really represents where cancer types are seen because there are some cancer types that are only seen in academic hospitals, and there are many for which it's not necessary to go to an academic facility — you can receive high-quality care closer to home.
HCI: Do you think that we'll see the large EHR vendors like Epic and Cerner partner with AI companies to take advantage of these large language models to extract more insights from data and to automate processes fairly soon?
Adamson: We are already seeing it! Some of the most common use cases that we're seeing are in the lowest-risk applications. When we think about the benefits and risks of applying AI models, for example, in a clinical decision support tool, there may be different considerations. You would have to ensure that it's safe for deployment vs. an application that is gaining administrative efficiencies. There are a lot of burdensome parts of our healthcare system such as prior authorization. A lot of those opportunities to relieve the administrative burden are some of the easiest and safest places to begin to start gaining efficiencies.
HCI: Is health equity in oncology another area where Flatiron has an interest? I saw one of your company’s presentation titles from the recent ISPOR conference was about using the data to highlight measures of neighborhoods, structural racism and overall survival among patients with metastatic breast cancer. Another was about racialized economic segregation and inequities and survival among patients with multiple myeloma. Are you looking at the data to see where inequities exist in the system?
Adamson: Absolutely. We started by looking at the effect of Medicaid expansion on reducing racial inequities in timely cancer treatment. Then Flatiron hired a lead researcher for health equity. And since then, we've had a really robust amount of incredibly rigorous research. The life science companies that we partner with are responding to new guidance from the FDA for diversity planning. One of the benefits of having access to real-world data is that there are people in the real world who use these drugs who were not represented in the trial. They can be from historically marginalized communities or lower socioeconomic status, and they didn't have access to clinical trials. Sometimes it's in a post-marketing commitment or requirement or it could even be in a clinical trial planning phase to identify sites or more proactively recruit to be able to get a representative clinical trial. But at the end of the day, we as a society want to make sure that these drugs are safe and effective for everyone in the population for whom the products are indicated. For groups that are underrepresented in the trial, that's where we've done a lot of research to be able to understand how safely these drugs are working.
HCI: Flatiron recently announced a partnership with Sanofi about improving clinical trial data acquisition by digitally transferring data from the EHR to the clinical trial’s electronic data capture (EDC) system. Has that been a problem for health systems participating in clinical research for a while?
Adamson: I personally worked in clinical trials for years before joining Flatiron, and one of the resource-intensive, burdensome responsibilities in conducting trials is following all of the protocol-mandated documentation. There's been a lot of unnecessary duplication of documentation. A doctor might be caring for the patient on one computer, working in the EHR, recording all the information that's guiding their treatment decisions, and how well the patient is doing, and then moving to another computer and filling out everything that's needed to be collected for the clinical trial. And it's just absolutely not necessary. One of the barriers has just been the technology to be able to get rid of that duplication. Because Flatiron has EHR software that is able to document things with regulatory grade quality and curation, it gives us a huge step up in being able to develop these new products that are being deployed and used in prospective clinical trials.
HCI: Another recent announcement was that you were offering integrated real-world evidence solutions — end to end services. How does that differ from what the company has been doing previously?
Adamson: We're at an inflection point. We have this decade of experience delivering high-quality real-world datasets. Now we are expanding into delivering evidence. The first stage is curation of data and then the second stage is generating evidence and insights from it. That's really where a lot of my expertise comes into play, which is designing studies to answer research questions. Flatiron as a service now, from end to end, is offering that intellectual methodological partnership to understand the research question, support designing the study, using Flatiron’s real-world oncology data, to be able to conduct the experiment and be able to generate meaningful results. It is really fulfilling for me because as a scientist who's been generating evidence at Flatiron for years, it's not always something that we've made readily accessible to clients, even though they've been demanding it for a long time.
HCI: I write sometimes about large federated data networks like PCORnet. The researchers there are also doing real-world data studies. Do you think that's comparable to the kinds of things that Flatiron is doing? Are there more challenges working across multiple health systems?
Adamson: Well, I love PCORnet. It's a challenge to curate data with clinical depth when you're covering lots of disease areas. Flatiron has had the advantage of a decade of staying really focused on cancer, because it's hard to have both depth and breadth. We've seen this as a challenge for many other real-world data companies. One of the approaches Flatiron has taken is having oncologists on staff as part of the team designing our data models. That has allowed us to create products that are refreshed every single month, and that can be used to answer lots of different questions. I think that everyone's efforts to learn from real-world data are important, and it's really hard to do.
HCI: Well, could someone else become the Flatiron for diabetes or the Flatiron for heart disease? Or are there different challenges around those diseases that would make this approach more difficult?
Adamson: There are different challenges for every different disease area. The way that you would need to approach it to be able to get complete data would be different. One of the unique things about cancer is that people are seen by oncologists at cancer clinics. But it's difficult to understand the experience of people who are living with HIV and cancer because they're seeing a different doctor for their HIV than for their cancer. When we look at COVID, one of the reasons why the UK and Israel were much more agile in being able to generate real-world evidence is because they have universal health care that includes primary care. If people aren't yet sick, they have a good understanding of who that denominator is in the general population. At Flatiron, we are able to answer questions about people that we know have cancer and are getting treated. We would not be fit for purpose to answer a question like: here's the entire U.S. population who might be about to develop cancer, because we don't have visibility into it. For every different disease area, the care navigation is different. To become the Flatiron for heart disease or Flatiron for diabetes, you really have to be able to have visibility into where those patients are during the part of their journey that you want to get insights into.