Researchers: Hospitals Need to Improve Accuracy, Bias Measurement in AI
A team of researchers has conducted a detailed analysis of artificial intelligence (AI) and predictive analytics models and concluded that, while some hospital and health system leaders are working to ensure model accuracy and to guard against bias, many have not engaged in rigorous work in that area.
The researchers (Paige Nong, Julia Adler-Milstein, Nate C. Apathy, A. Jay Holmgren, and Jordan Everson) write in detail in the January issue of Health Affairs about their analysis of the procedures and protocols that hospital and health system leaders are following in implementing AI in their patient care organizations. The abstract of the article, entitled “Current Use and Evaluation of Artificial Intelligence and Predictive Models in US Hospitals,” begins thus: “Effective evaluation and governance of predictive models used in healthcare, particularly those driven by artificial intelligence (AI) and machine learning, are needed to ensure that models are fair, appropriate, valid, effective, and safe, or FAVES. We analyzed data from the 2023 American Hospital Association Annual Survey Information Technology Supplement,” they write, “to identify how AI and predictive models are used and evaluated for accuracy and bias in hospitals. Hospitals use AI and predictive models to predict health trajectories or risks for inpatients, identify high-risk outpatients to inform follow-up care, monitor health, recommend treatments, simplify or automate billing procedures, and facilitate scheduling.”
Accordingly, the researchers report: “We found that 65 percent of US hospitals used predictive models, and 79 percent of those used models from their electronic health record developer. Sixty-one percent of hospitals that used models evaluated them for accuracy using data from their health system (local evaluation), but only 44 percent reported local evaluation for bias.”
That issue is extremely important, the researchers note. In efforts to reach a strong level of “FAVES” (fair, appropriate, valid, effective, and safe), they write, “Bias is a central focus, as an increasing number of empirical analyses reveal racial and other forms of bias in algorithms that perpetuate or exacerbate inequities by building barriers to needed care, perpetuating harmful race-based medicine, or underrepresenting patient populations.” At the same time, “The accuracy of models is also a concern, given variations in healthcare practices, data capture, and patient populations.” Indeed, they write, “AI trained on certain data sets might not be effective or valuable when deployed in settings that differ from the training data.”
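The survey itself does not prescribe an evaluation method, but the idea of “local evaluation” for accuracy and bias can be illustrated with a minimal sketch: scoring a model’s predictions against a hospital’s own outcome data, both overall and within patient subgroups. The data, variable names, and choice of AUROC as the metric below are all hypothetical illustrations, not the authors’ methodology.

```python
# Minimal sketch of "local evaluation": checking a predictive model's
# discrimination on a hospital's own data, overall and by patient subgroup.
# All data and names here are hypothetical illustrations.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 1000
risk_scores = rng.uniform(0, 1, n)  # the model's predicted risk per patient
outcomes = (rng.uniform(0, 1, n) < risk_scores).astype(int)  # observed events
groups = rng.choice(["group_a", "group_b"], n)  # demographic group labels

# Accuracy check: overall discrimination on local data.
print(f"Overall AUROC: {roc_auc_score(outcomes, risk_scores):.3f}")

# Bias check: does discrimination hold up within each subgroup?
for g in np.unique(groups):
    mask = groups == g
    auc = roc_auc_score(outcomes[mask], risk_scores[mask])
    print(f"AUROC for {g}: {auc:.3f}")
```

In a sketch like this, a marked gap in subgroup performance would flag the kind of algorithmic bias the authors describe, while overall performance well below what the model’s developer reported would flag the dataset-shift concern, a model trained on data that differ from the local population.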
Further, the team’s analysis of AI model implementation found that, “Although more hospitals reported locally evaluating for accuracy than for bias, more than one-third did not provide accuracy assurance, either.” And, concerningly, they note that 56 percent of hospitals using models did not report evaluating them for bias.
As the article’s authors note, “Using 2023 national data on US hospitals, we identified how AI and predictive models are being used, applied, and evaluated. We found that although most US hospitals reported using predictive models, fewer than half of them systematically evaluated models for bias, and just two-thirds evaluated them for accuracy. Our findings,” they conclude, “point to the need for additional guidance and resources for independent and underresourced hospitals to ensure the use of accurate and unbiased AI for patients regardless of where they receive care. Similarly, the growth and broad impact of providers’ self-developed models that are currently outside the scope of federal regulation could warrant additional consideration.”
That said, the researchers are not advocating for major, potentially inhibiting regulations; instead, they write, “Targeted policies that support the availability of FAVES AI and bolster capacity to evaluate and govern AI may be most effective, including interventions designed to connect underresourced hospitals to evaluative capacity.”