At UPMC, A Push Into the Thickets of Data Governance—On a Grand Scale
Vivek Reddy, M.D., Chief Medical Information Officer, UPMC Health Services Division, and Assistant Professor of Neurology at the University of Pittsburgh
Data governance is advancing at the vast, 21-hospital University of Pittsburgh Medical Center (UPMC) Health System. More than three years ago, leaders at the integrated health system made a corporate decision to build out a serious data governance strategy as the organization moved forward on population health management and in a number of other areas.
Recently, one of UPMC’s senior leaders spoke with HCI Editor-in-Chief Mark Hagland about the evolution of data governance in his organization. Vivek Reddy, M.D., is chief medical information officer of the UPMC Health Services Division, which manages health IT for all outpatient services and owned medical groups under the UPMC umbrella. He is also an assistant professor of neurology at the University of Pittsburgh. Below are excerpts from their interview.
You’ve been moving forward in so many areas recently. What has been particularly noteworthy in UPMC’s efforts of late?
One area in which we’ve been spending a considerable portion of our energy as we ramp up our analytics capabilities has been data governance. We kicked off our big analytics program, and we’re now in year three. We simultaneously started a formal data governance program at UPMC, in which key constituents form a data governance council and we formalize and standardize the use of data across the organization. That comes down to such mundane issues as data definitions and where we send data: everything from simpler decisions about how to send data to different places, to more complex definitions of different disease states.
This program was set up to help our organization embrace the concept of data as an asset, rather than just data being something sitting in databases.
How has that all evolved?
When we kicked off this program, we didn’t really know what was going to happen, or what it would be like. And we spent a lot of our early time focused on the relatively low-hanging fruit, such as, what coding system to use around diagnoses, or what’s the best source system around medications or lab data? So we went after low-hanging fruit first.
Now in year three, as we analyze the data we’ve got pumping into the warehouse, the most challenging aspect is moving into the clinical realm. What we have found is that our understanding of, and level of confidence in, the clinical data we’ve acquired through the years is suspect at best. Because our electronic record systems were built for physician efficiency and billing purposes, we ended up with note bloat a lot of the time, and we’ve had a lot of trouble defining the prevalence of clinical conditions, because people are using shorthand or aren’t putting information in the correct location in the EHR. This is a huge problem for folks trying to embark on personalized medicine, or on any endeavor where you’re trying to connect genomics data to clinical data for comparative effectiveness. The fact is that you’re not 100-percent sure that the data you have is truly accurate.
So our data governance program, which is somewhat unique, is not an IT-driven project. We’ve taken key clinical leaders and made them own the definitions of certain diseases or data elements, and owning those elements has required them to decide: OK, which system will be my gold standard, which will be my backup, and so on? If I use claims data from my health plan, that will be my third-ranking set of data, for example. What this does is force our organization to own the quality of its data, to know where that data is coming from before it feeds into any business or clinical process, and to stand behind the fidelity of that data.
So it’s a really important paradigm shift. If folks are going to get into the space of using data for real, they really have to understand how data is generated, and by which systems. When people talk about data governance or quality, it’s traditionally been thought of as an IT issue. But now, our clinical and business colleagues actually own the data and will define what to do with it going forward.
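The source ranking Reddy describes (a gold-standard system, a backup, and claims data in third place) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source-system names, the `Observation` record, and the `resolve` function are hypothetical and are not drawn from UPMC’s actual systems.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple

# Hypothetical ranking chosen by the clinical owner of a data element.
# Lower number = more trusted. The system names are illustrative only.
DIAGNOSIS_SOURCE_RANK: Dict[str, int] = {
    "ehr_problem_list": 1,    # gold standard
    "ehr_encounter_dx": 2,    # backup
    "health_plan_claims": 3,  # third-ranking source
}


@dataclass(frozen=True)
class Observation:
    patient_id: str
    element: str   # e.g. a disease definition owned by a clinical leader
    value: str
    source: str    # which system produced this value


def resolve(
    observations: Iterable[Observation],
    rank: Dict[str, int],
) -> Dict[Tuple[str, str], Observation]:
    """For each (patient, element), keep the value from the best-ranked source.

    Observations from sources that the owner has not ranked are ignored,
    so every value that survives can be traced to an approved system.
    """
    best: Dict[Tuple[str, str], Observation] = {}
    for obs in observations:
        if obs.source not in rank:
            continue  # unranked source: not approved for this element
        key = (obs.patient_id, obs.element)
        current = best.get(key)
        if current is None or rank[obs.source] < rank[current.source]:
            best[key] = obs
    return best
```

Because each resolved value keeps its `source` field, a downstream report can state where the data came from, and claims data is used only when no higher-ranked system has a value for that patient.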
It sounds as though what you and your colleagues at UPMC are doing is unique at the moment, in terms of the breadth and depth of the data governance strategy?
Yes. A lot of organizations start with this model in mind, but then, to speed up the process of doing analysis, we have found that a lot of organizations cut corners or look for the “easy button,” saying that gaps in data will be dealt with as kind of “rounding errors.” But we’ve said this isn’t minor; the level of importance we’re assigning to owning data and data definitions is significant. A lot of this requires specialized knowledge, and really, the only folks who can do this well are those who live and breathe the data. We’re allowing for different data definitions, but in every report we’re stating where we got the data and how we’re defining certain populations or types of data.
It’s been an interesting challenge, because when most organizations kick off an effort like this, they begin with this idealistic notion, but a lot of bad habits creep in. As we enter our fourth year, we’re really focusing on this moving forward.
What advice might you have for healthcare and healthcare IT leaders in other organizations, as they begin to move forward in this very important area around data governance?
I really think the corner-cutting method, especially since we’re dealing with healthcare data, is an easy one to fall into when one is looking to get to a finish line. But the reality is that these shortcuts will actually create a significant amount of rework and will reduce the amount of confidence you end up with. So I’ve said you can’t put up a dashboard unless you know where the data is coming from and how it’s defined. And you start to realize that if you don’t define how you’re using red, green and yellow, you’ll start to lose the trust of the end-users of the data.