Researchers: AI-Generated Clinical Summaries Need Fine-Tuning

Feb. 6, 2024
LLMs summarizing clinical notes could introduce errors that affect clinicians' decisions

Clinical applications of generative artificial intelligence (AI) and large language models (LLMs) are progressing; LLM-generated summaries can provide benefits and could replace many future electronic health record (EHR) interactions. However, according to a team of researchers, LLMs that summarize clinical notes, medications, and other patient information operate without US Food and Drug Administration (FDA) oversight, which the researchers see as a problem.

In a viewpoint article for the JAMA Network, published online on Jan. 29, Katherine E. Goodman, JD, PhD, Paul H. Yi, MD, and Daniel J. Morgan, MD, MS, wrote, “Simpler clinical documentation tools…create LLM-generated summaries from audio-recorded patient encounters. More sophisticated decision-support LLMs are under development that can summarize patient information from across the electronic health record (EHR). For example, LLMs could summarize a patient’s recent visit notes and laboratory results to create an up-to-date clinical ‘snapshot’ before an appointment.”

Without standards for LLM-generated summaries, there is potential for patient harm, the article’s authors write. “Variations in summary length, organization, and tone could all nudge clinician interpretations and subsequent decisions either intentionally or unintentionally,” Goodman, Yi, and Morgan argued. Summaries vary because LLMs are probabilistic: there is no single correct answer about which data to include or how to order them, and slight variations in prompts can change the outputs. The JAMA Network article provides an example of a radiography report noting chills and a cough; the summary, in this instance, added the term “fever.” This added word completes an illness script and could affect the clinician’s diagnosis and recommended course of treatment.
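The variation the authors describe stems from how LLMs generate text: each token is sampled from a probability distribution, so identical inputs can produce different outputs. The following is a minimal, self-contained Python sketch of temperature-based sampling; the candidate tokens and logit values are invented for illustration and do not come from any real clinical model or from the JAMA article.

```python
import math
import random

# Invented example logits for candidate next tokens in a summary.
# At temperature 0 the highest-scoring token would always win; above 0,
# a lower-probability token such as "fever" can occasionally be sampled
# even though it was never documented in the source note.
CANDIDATES = {"cough": 2.0, "chills": 1.8, "fever": 1.2, "fatigue": 0.5}

def sample_token(logits: dict[str, float], temperature: float,
                 rng: random.Random) -> str:
    """Scale logits by temperature, apply softmax, and sample one token."""
    scaled = {tok: lg / temperature for tok, lg in logits.items()}
    m = max(scaled.values())  # subtract max for numerical stability
    weights = [math.exp(v - m) for v in scaled.values()]
    return rng.choices(list(scaled.keys()), weights=weights, k=1)[0]

rng = random.Random()
for run in range(3):
    # Same "prompt" (same logits) each time, yet the sampled token can differ.
    print(f"run {run}:", sample_token(CANDIDATES, temperature=1.0, rng=rng))
```

Running the sketch repeatedly produces different tokens across runs, which is the summary-to-summary variability the authors argue has no analogue in conventional, deterministic clinical software.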

Turning to regulation, the authors write, “[F]DA final guidance for clinical decision support software…provides an unintentional ‘roadmap’ for how LLMs could avoid FDA regulation. Even LLMs performing sophisticated summarization tasks would not clearly qualify as devices because they provide general language-based outputs rather than specific predictions or numeric estimates of disease. With careful implementation, we expect that many LLMs summarizing clinical data could meet device-exemption criteria.”

The article’s authors recommend regulatory clarifications by the FDA, comprehensive standards, and clinical testing of LLM-generated summaries.
