Algorithm Identifies Increase in Long COVID Burden

Research finds that among patients with COVID-19 across 58 U.S. hospitals, approximately 16% developed long COVID

Researchers used a novel “precision-phenotyping” algorithm developed at Mass General Brigham to detect long COVID in longitudinal electronic health records and found a higher prevalence than previously reported.

In a paper published in JAMA Network Open, researchers say that the true toll of long COVID may be hidden from current surveillance systems that rely on capturing diagnostic codes. 

The investigators used the algorithm to comb through the medical records of nearly 460,000 patients with COVID-19 across 58 U.S. hospitals and found that approximately 1 in 6, or roughly 16%, developed long COVID. These rates, which translate to more than 18 million Americans, are double current estimates and reflect an increasing prevalence of chronic conditions following COVID-19 infection.

“Over 10 million people with long COVID would go entirely undetected by the diagnostic code that health systems and policymakers rely on to track the disease burden,” said study corresponding author Hossein Estiri, Ph.D., a faculty member in the Mass General Brigham Department of Medicine, in a statement. “The figures we uncovered are almost certainly an undercount.”

Current diagnostic coding, including U09.9, the ICD-10 code designated for post-COVID conditions, captures fewer than 7% of patients with long COVID.

Mass General Brigham’s algorithm, which was previously validated, treats long COVID as a diagnosis of exclusion: it identifies conditions that appeared after COVID-19 infection and cannot be explained by preexisting conditions already in a patient’s medical history.
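The diagnosis-of-exclusion idea can be illustrated with a deliberately simplified sketch. This is not the Mass General Brigham algorithm, whose internals the article does not describe; the `Diagnosis` record, the function name and the rule "recorded after infection and never recorded before" are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Diagnosis:
    """Hypothetical record: one condition noted in a patient's chart."""
    condition: str
    recorded_on: date


def candidate_post_covid_conditions(history, infection_date):
    """Return conditions recorded after infection that have no earlier record.

    Toy exclusion rule only: a real phenotyping algorithm would also need to
    rule out other explanations, which this sketch does not attempt.
    """
    preexisting = {d.condition for d in history if d.recorded_on <= infection_date}
    after_infection = {d.condition for d in history if d.recorded_on > infection_date}
    return sorted(after_infection - preexisting)


# Illustrative, made-up chart: asthma predates the infection, prediabetes does not.
example_history = [
    Diagnosis("asthma", date(2019, 3, 1)),
    Diagnosis("asthma", date(2021, 2, 10)),
    Diagnosis("prediabetes", date(2021, 4, 5)),
]
example_candidates = candidate_post_covid_conditions(example_history, date(2020, 6, 1))
```

In this made-up chart only prediabetes survives the exclusion step, because asthma was already documented before the infection date.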

Researchers analyzed electronic health records from 457,950 patients who had previously tested positive for COVID-19 across four U.S. regions: New England, Southeast Texas, Southern California and Western Pennsylvania. They identified long COVID in 16.3% of patients overall, with rates ranging from 13.6% to 22.7% across regions. Across the full study cohort, 14.5% of COVID-19 patients (66,587 individuals) developed chronic conditions requiring sustained clinical care. The study also uncovered regional variation in the clinical manifestations of long COVID, such as dramatically different rates of prediabetes, an emerging sequela of long COVID, across various parts of the U.S.
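The cohort figures can be cross-checked with simple arithmetic. The sketch below uses only numbers quoted in this article; the variable names are illustrative, and the implied long COVID patient count is derived here from the reported rate rather than taken from the study.

```python
# Figures quoted in the article.
COHORT_SIZE = 457_950           # patients with a prior positive COVID-19 test
CHRONIC_CARE_PATIENTS = 66_587  # developed chronic conditions needing sustained care
LONG_COVID_RATE = 0.163         # overall long COVID prevalence (16.3%)

# Share of the cohort with chronic conditions; the article reports 14.5%.
chronic_care_share = CHRONIC_CARE_PATIENTS / COHORT_SIZE

# Derived, not reported: patient count implied by the 16.3% rate.
implied_long_covid_patients = round(COHORT_SIZE * LONG_COVID_RATE)

# "Approximately 1 in 6" corresponds to the reciprocal of the rate.
one_in_n = 1 / LONG_COVID_RATE

print(f"Chronic-care share: {chronic_care_share:.1%}")
print(f"Implied long COVID patients: {implied_long_covid_patients:,}")
print(f"Roughly 1 in {one_in_n:.1f}")
```

The reported 14.5% matches the ratio of the two quoted counts, and a 16.3% rate is a little over 1 in 6.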

Contrary to the assumption that long COVID is a legacy of early waves of the pandemic, the researchers also found that cumulative prevalence continued to increase through mid-2024 across all regions studied. 

Statistical modeling showed significant quarterly increases in New England, Southern California and Western Pennsylvania, with trends pointing to continued growth over the next decade if current patterns persist.

“This work demonstrates how longitudinal clinical data in a health system can be structured and analyzed to support more consistent identification of complex post-viral conditions,” said Shawn Murphy, M.D., Ph.D., study co-author, in a statement. “There is significant potential for clinical AI when it is designed for public health and integrated across real-world care settings,” added Murphy, who is chief research information officer for the University of Washington.

The researchers note that their findings do not include undocumented infections, which have become the majority since widespread testing ended, and exclude patients without longitudinal medical records. These limitations suggest the overall disease toll of long COVID may be even higher.

“These patients are not absent from clinical care; they are absent from the diagnostic code that would identify them as long COVID patients,” said lead study author Jiazi Tian, M.Sc., a data scientist in the Clinical Augmented Intelligence Group at Mass General Brigham, in a statement. “The cardiologist seeing new dysautonomia, the endocrinologist seeing new metabolic disease, the neurologist seeing unexplained cognitive complaints: some of these presentations are long COVID arriving without the label that would connect them to a COVID-19 infection. This study demonstrates how hospitals can leverage AI to help fill surveillance gaps that public health agencies are no longer tracking.”

 

About the Author

David Raths


David Raths is a Contributing Senior Editor for Healthcare Innovation, focusing on clinical informatics, learning health systems and value-based care transformation. He has been interviewing health system CIOs and CMIOs since 2006.

