During an Oct. 25 National Academy of Medicine Workshop on Generative AI and Large Language Models in Health and Medicine, health system executives and other stakeholders spoke about the governance, regulation and deployment issues they are grappling with.
“We're transitioning from AI as a tool to AI as an assistant. But we have to keep in mind the future of AI as a colleague, and how we regulate and consider the different applications will change over time,” said Vincent Liu, M.D., M.S., a senior research scientist at Kaiser Permanente’s Northern California Division of Research.
In the tool stage, machine learning can be relentless in achieving one goal, but that goal can be quite limited, and it is more easily controlled, Liu said. “Because we know all the inputs that go in, we have some expectation about the outputs that come out. And that's where we are today in the industry. We are using tools for evaluating X-rays or predicting deterioration or other applications, and our focus is on teaching our providers how to use that tool correctly.”
“You have to think about the use case and the benefits and drawbacks of that specific tool. But I think what we're seeing now is unlocking the capabilities of AI, especially generative AI as potentially the most fabulous assistant you've ever had — your reference librarian, your medical resident, your translator, your patient liaison, your scribe — all of those things,” Liu said. “Now we are interacting with these tools as assistants to begin to understand how to direct them. Can we engineer the prompts or the way that we interact to be maximally efficient for us in the future? I think we have to be cognizant that there's a future where AI is a colleague, and that is actually a kind of a ground-shifting thought.”
Nigam Shah, M.B.B.S., Ph.D., professor of medicine at Stanford University and chief data scientist for Stanford Health Care, said that when thinking about the potential for generative AI, we have to consider why some previous attempts to deploy earlier AI in healthcare have fallen short.
He said there is an interplay among machine learning models, policies, and the capacity to take action, which together determine the net benefit of the actions themselves. Good AI-guided work happens as an interplay of these three things.
Shah said there have been hundreds of predictive models developed for population health, readmissions predictions, and sepsis predictions. “Often we don't have the policies and the work capacity designs set up correctly to achieve the promised usefulness that we could have gotten,” he said. “The risk I see is that we didn't get it right for the traditional or regular AI. What are we doing as a community to ensure that our response to generative AI will be better? And I'm part of CHAI — the Coalition for Health AI. We are talking about having a place, an assurance lab, so to speak, where we can analyze performance of these models in light of work capacity constraints, hopefully via simulation, and there are data available to perform such analyses. Right now, we find ourselves in a situation where the big tech companies have the models, the large health systems have the data, and the researchers are quote, unquote, locked out. We need to create a safe place, this assurance lab, where we can analyze this interplay amongst models, work capacity and policies. The unique risk here is we don't study this interplay, particularly for generative AI, which is just going to make things faster and harder to contain.”
Gil Alterovitz, Ph.D., Department of Veterans Affairs’ chief AI officer and director of the VA National Artificial Intelligence Institute, described how the VA set out several years ago to create its own AI strategy, one of the first federal agencies to do that.
“We brought together over 20 offices,” he said. “The VA has a number of different offices that leverage AI or think about AI in different ways, and we helped bring them together by creating a task force and an AI working group. We've been looking at doing things proactively before they're perhaps required to be done. We also created a VA agency-wide trustworthy AI framework, and created a list of AI use cases.”
The VA also set up a collaborative, shared AI governance structure. “That way, we're able to understand the use cases as they develop from the beginning,” Alterovitz said. “We're able to catalog those use cases and then evaluate them as needed. We have these AI oversight committees at different medical centers that can scale up and feed into the national level.”
In addition to the AI oversight committees, for research the VA is leveraging existing institutional review board structures in reviewing AI modules. One of the keys, Alterovitz said, is helping people figure out what to ask. “We find that one of the biggest challenges is actually knowing what questions to ask in the first place. Once they know the questions, they can begin gathering subject matter experts to help on that. Through this process, we've actually found industry trials and other cases where there was either lack of transparency or issues in data. Some of these are not even necessarily AI issues. They may be privacy, security, or other kinds of issues. But sometimes there's a need to have this checklist to know what to look through, so at the VA, we've developed that for these different parts of the organization, whether it be research, that's the IRB, or more operational use cases, the AI oversight committees.”
“There are some amazing ideas for generative AI, and we need to be very clear about them for clinicians and workflow,” said Jackie Gerhart, M.D., a family medicine physician and clinical informaticist at Epic. “Chart summaries are one of the big projects we're working on right now in terms of taking an entire clinical chart and trying to distill it down not just to the key points in general or for the patient, but specifically for each type of user and for each type of instance.”
Another use case, she said, is called Messaging Made Easy, which involves drafting responses to patient messages in the inbox. “We've seen a huge 150 percent increase in patient messages during the pandemic, and this is going to really help our clinicians be able to answer their patient questions more quickly,” Gerhart said. After the draft message is created, the clinician can edit the note, she said.
Steven Waldren, M.D., M.S., chief medical informatics officer at the American Academy of Family Physicians, said that in research about using AI for documentation, AAFP saw over a 70 percent reduction in the documentation time for doctors leveraging an AI solution that wasn't using generative AI. “It now has generative technology using this more ambient technology and that ramps it up even further,” he said.
“One of the other big challenges is that family doctors have that cognitive burden when a patient comes in with multiple problems. They have a 15-minute visit and data everywhere. How do they pull that all together in one place that makes it easy? We've seen another AI solution that we've been working with that creates a problem-oriented summary, and for those patients that decreases the physician time by 60 percent.”
“There are some areas that we can really be focused on that are lower risk and will make a huge impact,” Waldren said. “I think that will drive great adoption in the physician community of these types of solutions and pave the way for other things to go forward.”