How UCSF Physician Execs Are Thinking About ChatGPT

There is a lot of industry buzz about the potential impact on healthcare of ChatGPT, the chatbot developed by OpenAI and launched in November 2022. During a recent Grand Rounds hosted by Robert Wachter, M.D., professor and chair of the Department of Medicine at the University of California at San Francisco (UCSF) Health, several executives at UCSF weighed in on where they think this technology is going and the roles of academic medical centers and health IT developers.

The conversation among the UCSF executives was wide-ranging and engaging, and is worth watching in full on YouTube, but I pulled out a summary of comments that might be particularly pertinent to our readership.

Wachter mentioned that ChatGPT is trained on large amounts of data from the internet, but not from Epic, not from patient charts. “So where is this going? What's the point of integration? Do we have to buy it? Does Epic? Do we buy a third-party tool? Does Epic buy a piece of ChatGPT? Does Microsoft buy Epic? How does this all come together?”

Sara Murray, M.D., associate professor of medicine in the Division of Hospital Medicine at UCSF Health and associate chief medical information officer, responded that this could go in a couple of different directions. “We could be co-developing this with companies like Microsoft, which I think is probably the quickest path to getting these models implemented,” she explained. “There are all these unsupervised models that are trained on the corpus of the internet, but then you can do supervised training on top of them or you can add information to them. These models could be trained to also read the patient's chart when it's responding, and also read my history of responding to patients so that it knows how I like to speak and my tone and that type of thing, and then incorporate it. I think the scalable solution is that a company like Epic collaborates with a company like Microsoft, but I don't know if that's the fastest path to implementation.”

Atul Butte, M.D., Ph.D., is the chief data scientist for UC Health, representing all six University of California academic medical centers. He also serves as the inaugural director of the Bakar Computational Health Sciences Institute and is the Priscilla Chan and Mark Zuckerberg Distinguished Professor of Pediatrics, Bioengineering and Therapeutic Sciences, and Epidemiology and Biostatistics.

He spoke about UCSF’s role as an academic medical center and its relationship with companies like Google, Microsoft and OpenAI. “We have always had this role of training the next generation of practitioners: nursing students, medical students, dental students,” Butte said. “We create and evolve a curriculum to train that next generation. We work with publishers on case-based materials and then we made sure our next generation passes those regulatory hurdles, whether it's USMLE, nursing exams, dental credentialing licensing, so then why aren't we as academic medical centers as driven when the next generation is going to be computational? Why are we just handing this off to companies? I believe it should be our role as an academic medical center to teach this next generation the same way we teach humans in the next generation, right? We shouldn't be just handing this off to companies right now.”

Butte said UCSF’s responsibility has to go into “teaching ChatGPT-10, or whatever, about how to practice medicine. We’ve got to research the best curricula to use it to teach these computers. We’ve got to evaluate how well we're doing. We need to partner with companies to do this. We're not going to be able to do all this ourselves. But we also have to help these new learners overcome regulatory barriers like the FDA and the rest. We shouldn't just hand this responsibility off to companies, whether it's OpenAI or Microsoft or Google. We’ve got to work with them. It’s our responsibility.”

Today, more than 500 AI and machine learning tools and methods have been approved by the FDA, 150 just in radiology. Noting that this is a very active field, Butte said that you start to get a sense that “if we're not inventing some of these things, we're going to have to buy some of them at some point. So we should be on the innovation side as an academic medical center.”

Wachter noted that AI has been hyped as being around the corner for 30 years in healthcare, and basically flamed out. “We've talked about the AI winter, when the original tools that were going to replace doctors’ diagnostic skills didn't go anywhere. They gave crazy answers and then the field kind of died. Was that simply a problem of the tools not being good enough and now this tool is good enough so it is going to take over? Or was it a problem that we were asking the wrong questions? Is it just a matter of now that we have a really spiffy tool, and it's going to win the day?”

Aaron Neinstein, M.D., is associate professor in the division of endocrinology, vice president of digital health at UCSF Health, and senior director of the UCSF Center for Digital Health Innovation. He responded to Wachter’s question by saying, “I think it's a little bit of both. I think this is categorically different. The first time I sat down to use GPT, and others have probably had this experience, I had a moment of thinking the world has changed and will never be the same after this moment as it was before,” he said. “GPT-4 is around the corner and is supposed to be an order of magnitude more impressive. I think there's something about the technology here that is categorically different.” He noted that previously with AI you needed to spend a lot of effort building the right algorithm to answer the right question, putting in the right training data, designing the right workflow. “That may not need to happen here. We can pretty easily imagine the experience of opening my in-basket in the future, and sitting down and having responses trained on my last 10 years of in-basket replies, and reading that patient's chart teeing up a draft of a reply for me, letting me edit it and sending it off. I think this is categorically different.”

Neinstein noted that one enterprising doctor posted a TikTok video using ChatGPT to write an appeal letter to an insurance company, trying to argue for an echocardiogram. “People are starting to think about the use cases that are taking up a lot of their time and where an AI helper could help offload work and do something faster for them,” he explained. “I took a long chunk of information that I had about hypoglycemia and hyperglycemia. This is a couple of pages of information, very dense, very textbook, and I asked ChatGPT to summarize it at a 6th-grade reading level, and it spits out a very nice one-paragraph summary. Well, maybe that's a little bit too simple. Let's do that at a 9th-grade reading level, and it makes it a little bit more complex. You can start to see the opportunity here to speed up tasks that we do that take up a lot of time and that can allow for a little bit more personalization and customization. I would still go in and edit this. But it's getting the work started for you.”

Neinstein added that if you look around at primary care right now, patients are not happy. Doctors are not happy. People are burning out. “They're spending a lot of their time on administration, bureaucracy, and paperwork,” he said. “I think at least in the early days, taking that work out of the system and giving our doctors, our nurses more time to spend with patients feels like it's going to be a really net-positive.”

Murray agreed. “I think everyone will be happier if doctors are spending less time doing tedious clerical work. I don't think anyone wants their doctor doing that work. I think we have to be careful, as we start thinking about implementing these as drafting tools, where I do think there's a lot of promise,” she warned. “Other industries have started doing this. CNET, for example, writes a lot of their articles with this technology now and everything is supposed to be reviewed by a person, but they're finding errors in these articles that clearly a human author wouldn't have made and they're not catching it. We just have to be careful as we start rolling these things out that we don't put too much trust in them, that we don't get sloppy, but I think overall it's going to reduce a lot of clerical work, and I think everyone will benefit.”

She also spoke about some early use cases that UCSF is looking at, such as streamlining literature reviews. “I think there's a lot of promise if we think about refining these models to focus on bodies of text that we trust, like the peer-reviewed literature. But what I'm most excited about is the opportunity for task automation and the potential to reduce clerical work for physicians.”

She gave an example of a MyChart message that was shared with her by a GI colleague. The patient is concerned about fatty changes in her liver and hyperlipidemia that's related to a drug she's on called Xeljanz. And she'd also like a TB test for work. Murray put this de-identified information into ChatGPT, and the reply it generated reads as thoughtful, she said. “It lets her know that we ordered a lipid panel and a TB test as she requested. It asks her to schedule a follow-up appointment and discuss the results and potential alternative treatments and then it makes lifestyle suggestions. And then finally, it writes with a compassionate voice, which I think patients really appreciate,” she added. “So imagine you open up your in-basket, and all of your replies are pre-drafted by an algorithm that read the patient's chart when it's drafting the response. Now your job in replying is really to edit for accuracy, which could be a total game changer in terms of reducing in-basket burden and clerical work for physicians.”

Murray said that UCSF is looking at piloting a tool called DAX (Dragon Ambient eXperience) from Nuance. “The idea is that you have this tool listening to all your clinical appointments, and it'll generate a note,” she said. “Those notes are actually reviewed for accuracy by a human at the company. Effectively it functions as an AI replacement for a scribe.” There are some challenges with this, Murray noted. It's in its relative infancy. There can be variable performance based on your specialty. There's a pretty substantial ramp-up period as the tool learns. “Currently, it's about the same cost as human scribes, which, as you know, can be tremendously expensive. But I'm optimistic because Nuance is actually now part of Microsoft, which has the exclusive license for GPT-3, so these tools are probably going to get dramatically better over the next couple of years. And hopefully the costs will come down, too.”