Researchers: AI-Generated Clinical Summaries Need Finetuning

Feb. 6, 2024
LLMs summarizing clinical notes could introduce errors that affect clinicians' decisions

Clinical applications of generative artificial intelligence (AI) and large language models (LLMs) are progressing; LLM-generated summaries can provide benefits and could replace many future electronic health record (EHR) interactions. However, according to a team of researchers, LLMs that summarize clinical notes, medications, and other patient information lack US Food and Drug Administration (FDA) oversight, which they see as a problem.

In a viewpoint article for the JAMA Network, published online on Jan. 29, Katherine E. Goodman, JD, PhD, Paul H. Yi, MD, and Daniel J. Morgan, MD, MS, wrote, “Simpler clinical documentation tools…create LLM-generated summaries from audio-recorded patient encounters. More sophisticated decision-support LLMs are under development that can summarize patient information from across the electronic health record (EHR). For example, LLMs could summarize a patient’s recent visit notes and laboratory results to create an up-to-date clinical “snapshot” before an appointment.”

Without standards for LLM-generated summaries, there is potential for patient harm, the article’s authors write. “Variations in summary length, organization, and tone could all nudge clinician interpretations and subsequent decisions either intentionally or unintentionally,” Goodman, Yi, and Morgan argued. Summaries vary because LLMs are probabilistic, and there is no single correct answer about which data to include or how to order it; even slight variations between prompts can change the outputs. The article offers the example of a radiography report noting chills and a cough, where the summary added the term “fever.” That one added word completes an illness script and could affect the clinician’s diagnosis and recommended course of treatment.

The authors of the JAMA Network article write, “[F]DA final guidance for clinical decision support software…provides an unintentional “roadmap” for how LLMs could avoid FDA regulation. Even LLMs performing sophisticated summarization tasks would not clearly qualify as devices because they provide general language-based outputs rather than specific predictions or numeric estimates of disease. With careful implementation, we expect that many LLMs summarizing clinical data could meet device-exemption criteria.”

The article’s authors recommend regulatory clarifications by the FDA, comprehensive standards, and clinical testing of LLM-generated summaries.
