Do machine learning algorithms require a systematic approach to validation and monitoring once they are deployed? Speaking during an online panel discussion on the potential unintended consequences of the application of artificial intelligence in healthcare, Peter Embí, M.D., M.S., president and CEO of the Regenstrief Institute, explained why he believes such a surveillance system is necessary.
Besides leading the Indianapolis-based Regenstrief Institute, Embi also serves as the associate dean for informatics and health services research at the Indiana University School of Medicine, associate director of the Indiana Clinical and Translational Sciences Institute, and vice president for learning health systems at IU Health. He was speaking on a panel on “Ethics, Law and Unintended Consequences of AI in Health Care” during a three-day conference on AI hosted by Regenstrief.
Embi recalled that a little over a year ago he was in Portland, Ore., at a meeting of the Healthcare System Research Network (HCSRN) when the question came up of how the unintended consequences of machine learning algorithms would be dealt with. “I said that something akin to pharmacovigilance, namely algorithmovigilance, the monitoring of computable algorithms for expected and unexpected effects, was going to be important,” he said. “Thinking about that further, we have come up with a proposed definition, which is the systematic monitoring of computable algorithms to detect and respond to expected or unexpected health effects. This is akin in many ways to pharmacovigilance, the monitoring of drug effects, and is something that will become increasingly important in one form or another and that we need to work on as a community.”
In other parts of the Regenstrief meeting, presenters spoke about progress in deploying algorithms in health system settings. For instance, Nigam Shah, M.D., associate CIO for data science at Stanford Health Care, described his team’s work combining machine learning and prior knowledge in medical ontologies to enable the learning health system. His team runs the country’s first service to use aggregate clinical data at the bedside for decision-making.
Embi said work like Dr. Shah’s in algorithm-driven and -supported healthcare decision-making is going to expand. “As with the introduction of any kind of tooling system, especially in a very complex environment like our healthcare delivery system, we know there are going to be effects and unintended effects. If there is anything we can anticipate, it is that we won’t be able to anticipate everything that is going to happen as we do this. We need a learning system as we do this.”
Embi spoke of the importance of monitoring the effects these algorithms have, “not only when they are initially deployed in the populations where they are developed, tested and improved upon, but ultimately when they get deployed beyond that setting, so we can address the potential as well as the unintended effects.”
As healthcare executives think about how to do this, Embi suggested an analogy from clinical trials. He described the four-phase process that moves pharmaceuticals and devices through FDA approval and ultimately releases these products onto the market.
“And then, critically important, though until recently undervalued and now a major focus of attention, is the post-marketing surveillance that needs to include thousands of people,” Embi stressed, “because we know that as we deploy interventions into practice, even having done phase 3 testing on sometimes thousands of people, that is still a relatively small number for some of these uses, and, of course, we know that there are adverse events we miss unless we track these things.”
In the field of pharmacology and drug development, pharmacovigilance emerged a few decades ago, has evolved to have its own methodology, and continues to grow. “Systematic surveillance approaches are important and growing, and increasingly we know we can use tools like our electronic health records and related data sources for even more systematic monitoring,” Embi said. “And of course there are projects like Sentinel and others that have been working on this for some time.”
The concept of pharmacovigilance has a science behind it, Embi noted. There is a definition for it: the science relating to the collection, detection, assessment, monitoring and prevention of adverse effects associated with pharmaceutical products. There are mechanisms and methods that have been put in place, including individual case safety reports and standards around how those should be reported. “I point these out to say that I think there is an analogy here to what we need to be thinking about with regard to the deployment of algorithms,” he said.
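To make that analogy concrete, here is a minimal sketch of what an algorithm-focused counterpart to an individual case safety report might capture. The record structure and field names are illustrative assumptions, not an established reporting standard.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class AlgorithmSafetyReport:
    """Illustrative analog of an individual case safety report for a deployed algorithm."""
    algorithm_id: str          # which model and version produced the output
    site: str                  # health system or care setting where it was deployed
    reported_at: datetime      # when the event was observed
    event_description: str     # what happened (e.g., missed high-risk patient, spurious alert)
    severity: str              # e.g., "near miss", "harm", "no harm"
    patient_population: Optional[str] = None  # subgroup affected, if known
    suspected_cause: Optional[str] = None     # e.g., data drift, upstream coding change
    outcome: Optional[str] = None             # clinical consequence, if any

# Example report; the scenario and values are purely hypothetical
report = AlgorithmSafetyReport(
    algorithm_id="sepsis-risk-v2.3",
    site="Hospital A",
    reported_at=datetime(2020, 6, 1, 14, 30),
    event_description="Alert did not fire for a patient later diagnosed with sepsis",
    severity="near miss",
    suspected_cause="lab feed outage led to missing input features",
)
```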
“Why do we need to do this? We know that there are, in fact, biases that exist in our data sources,” he explained. “Some of these are known and can be corrected for, and some are unknown. Unless we monitor what happens when algorithms developed on existing data are deployed into the real world, and how they then interact with the new data sources they encounter, we can’t be confident they are having the effects we intend.”
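One way such monitoring could work in practice is to compare a deployed model’s performance on the data it now encounters against the baseline established during development, and flag meaningful degradation. The sketch below assumes AUROC as the metric and an arbitrary tolerance; both are illustrative choices rather than a prescribed method.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def check_performance_drift(y_true, y_scores, baseline_auc, tolerance=0.05):
    """Flag a deployed model whose discrimination has drifted below its development baseline.

    y_true: observed outcomes collected after deployment (1 = event occurred)
    y_scores: the model's predicted risks for the same patients
    baseline_auc: AUROC measured on the development/validation population
    tolerance: how much degradation to accept before raising a flag (assumed value)
    """
    current_auc = roc_auc_score(y_true, y_scores)
    drifted = current_auc < baseline_auc - tolerance
    return current_auc, drifted

# Hypothetical monitoring window with synthetic data, for illustration only
rng = np.random.default_rng(0)
outcomes = rng.integers(0, 2, size=500)
scores = rng.random(size=500)
auc, flag = check_performance_drift(outcomes, scores, baseline_auc=0.82)
if flag:
    print(f"Algorithmovigilance alert: AUROC {auc:.2f} is below the development baseline.")
```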
Caution about generalizability is another issue he mentioned. When an algorithm is developed on a certain population, the degree to which it generalizes to the larger population can’t be assumed. “We need to monitor to make sure that what works in one health system, environment, region or population can in fact carry forward to others,” he said.
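In monitoring terms, that suggests running the same checks stratified by site or population rather than only in aggregate, so a model that carries over poorly to a new setting is caught. The brief sketch below makes the same illustrative assumptions (AUROC, an arbitrary tolerance) as the one above.

```python
from sklearn.metrics import roc_auc_score

def check_by_site(records, baseline_auc, tolerance=0.05):
    """Run the same performance check separately for each site or population.

    records: iterable of (site_name, y_true, y_scores) tuples for one monitoring window
    baseline_auc: AUROC established on the development population
    Returns the sites whose AUROC fell more than `tolerance` below the baseline.
    """
    flagged = {}
    for site, y_true, y_scores in records:
        auc = roc_auc_score(y_true, y_scores)
        if auc < baseline_auc - tolerance:
            flagged[site] = auc
    return flagged
```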
Finally, health systems should set up algorithm monitoring in order to promote trust. “It is important that those who are using these systems, whether providers or health systems, and certainly patients and populations, can trust that these increasingly used algorithms are actually going to have the effect that is intended,” Embi said. “Unless we are monitoring these things, how can we really trust that they are having the intended effect? Being transparent about that is very important.”