Study: Sophisticated AI Tools Can Detect Chest X-Ray Findings As Well As Radiologists Can

Dec. 3, 2019
A new study published in the journal Radiology finds that a sophisticated type of artificial intelligence can detect clinically meaningful chest X-ray findings as effectively as experienced radiologists can.

Even as RSNA19 moves forward at Chicago’s McCormick Place Convention Center, the results of research around the application of artificial intelligence in radiology practice are being announced this week. And the latest published study might be raising a goodly number of eyebrows in the radiology world.

A press release published Tuesday by the Oak Brook, Ill.-based Radiological Society of North America, sponsor of the annual RSNA Conference, reported this: “A sophisticated type of artificial intelligence (AI) can detect clinically meaningful chest X-ray findings as effectively as experienced radiologists, according to a study published in the journal Radiology. Researchers said their findings, based on a type of AI called deep learning, could provide a valuable resource for the future development of AI chest radiography models.”

Study co-author Shravya Shetty, an engineering lead at Google Health in Palo Alto, California, said in a statement contained in the press release: “We’ve found that there is a lot of subjectivity in chest X-ray interpretation. Significant inter-reader variability and suboptimal sensitivity for the detection of important clinical findings can limit its effectiveness.”

The press release went on to say: “Deep learning, a sophisticated type of AI in which the computer can be trained to recognize subtle patterns, has the potential to improve chest X-ray interpretation, but it too has limitations. For instance, results derived from one group of patients cannot always be generalized to the population at large. Researchers at Google Health developed deep learning models for chest X-ray interpretation that overcome some of these limitations. They used two large datasets to develop, train and test the models. The first dataset consisted of more than 750,000 images from five hospitals in India, while the second set included 112,120 images made publicly available by the National Institutes of Health (NIH). A panel of radiologists convened to create the reference standards for certain abnormalities visible on chest X-rays used to train the models.”

And the press release quoted Daniel Tse, M.D., product manager at Google Health, as stating: “Chest X-ray interpretation is often a qualitative assessment, which is problematic from a deep learning standpoint. By using a large, diverse set of chest X-ray data and panel-based adjudication, we were able to produce more reliable evaluation for the models.”

The press release went on to note that “Tests of the deep learning models showed that they performed on par with radiologists in detecting four findings on frontal chest X-rays, including fractures, nodules or masses, opacity (an abnormal appearance on X-rays often indicative of disease) and pneumothorax (the presence of air or gas in the cavity between the lungs and the chest wall). Radiologist adjudication led to increased expert consensus of the labels used for model tuning and performance evaluation. The overall consensus increased from just over 41 percent after the initial read to almost 97 percent after adjudication.”

What’s more, it added, “The rigorous model evaluation techniques have advantages over existing methods, researchers said. By beginning with a broad, hospital-based clinical image set, and then sampling a diverse set of cases and reporting population-adjusted metrics, the results are more representative and comparable. Additionally, radiologist adjudication provides a reference standard that can be both more sensitive and more consistent than other methods.”

"We believe the data sampling used in this work helps to more accurately represent the incidence for these conditions," Dr. Tse said. "Moving forward, deep learning can provide a useful resource to facilitate the continued development of clinically useful AI models for chest radiography."

The research team has made the expert-adjudicated labels for thousands of NIH images available for use by other researchers at the following link:

"The NIH database is a very important resource, but the current labels are noisy, and this makes it hard to interpret the results published on this data," Shetty said. "We hope that the release of our labels will help further research in this field."