On Monday morning, industry leaders shared a wide range of perspectives on the first panel discussion of the day at the Health IT Summit in Cleveland, sponsored by the Institute for Health Technology Transformation (iHT2--a sister organization to Healthcare Informatics under the Vendome Group, LLC corporate umbrella). But what all the panelists in the session "It's All About the Data" agreed on was this: data stewardship, data management, and data governance are going to become increasingly urgent concerns, as patient care organizations move into a new era, becoming data-driven enterprises.
Don Reichert, vice president and CIO of MetroHealth Systems (Cleveland), led a panel of leaders, including Thomas E. Love, Ph.D., professor of medicine, epidemiology, and director of the Biostatistics and Evaluation Unit in the Center for Health Care Research & Policy at Case Western Reserve University at MetroHealth Medical Center (Cleveland); Greg Rosencrance, M.D., chairman of the Medicine Institute at the Cleveland Clinic; Jeffrey Sunshine, M.D., Ph.D., CMIO at University Hospitals (Cleveland) and a member of the faculty of Interventional & Diagnostic Neuroradiology; Rush Shah, product manager in the PCe Analytics Factory at the Charlotte-based Premier, Inc.; Patrick Mergler, director of cancer informatics at University Hospitals Seidman/Case Comprehensive Care Center; and Michael McQuaid, U.S. Healthcare Solutions, OnX Enterprise Solutions (Cleveland).
Issues around the management, stewardship, and governance of data and data processes ran through the panel session, often dominating the discussion. Early on, Premier's Shah put one of the core challenges very bluntly. "Almost everyone recognizes that every organization is awash in data," Shah said. "There are far too many Excel spreadsheets floating around. And a big issue we have is master data management, as well as vocabulary harmonization. You have all this data in different systems that you have to harmonize. And you have to somehow enrich the data with metrics. But how are you doing your analysis of your data? Just describing facts and throwing your data together, into this junkyard of data, is not helping. You need to bring things together in a way that really makes for a story."
In fact, Shah went on to say, "You need to practice some kind of medical journalism around data that gives data its meaning. So I think," he said, "the drive needs to be in that direction. And the relevance comes with this idea of data stories. Are we describing our data in terms of stories that make sense and appeal to people?"
Mergler noted that "There is a lot of momentum in the federal government around the inclusion of patient-reported data being required. In the future, you're going to be required to find out how patients feel when they're at home--for example, their sharing with us their level of nausea, etc. And that will be a big step forward. I think that patient-reported outcomes data will be one of the most important advances in the next few years" in this area, he added.
Dr. Love said that inevitably, “There will be a lot of noise,” as many new types of data are added to the mix. “What does a data point mean? What does a hemoglobin A1c or a blood pressure reading mean? It is about patterns in populations, as well as about individuals’ patterns. And in terms of the quality of the data, we can’t shy away from the questions, but we don’t have the answers now,” he said.
Where Will the Data Scientists Come From?
Reichert said, “Today, we are using a new term called ‘data scientists.’ If you talk to people about what a data scientist is, it’s someone who’s writing reports, but also doing some analytics behind that. So how do we acquire more data scientists, and do we have to grow them ourselves?” he asked.
“We face this challenge every single day of a big skill deficit,” Shah said. “We are trying to grow them in-house. We’re trying to get master’s degree graduates from data science, but they’re not that versed in healthcare. So the big question is, do we get a technology person and teach them healthcare, or vice-versa? I personally am biased towards getting a healthcare informaticist. And data science comes with a lot of jargon that is not understandable to a layperson. So we’re trying to actually teach them graphic design and communication. We want healthcare people, because they have a sensitivity to healthcare, to patient care. We have to humanize the data but at the same time be really sensitive to the nuances and the complexities, we cannot just ignore it. So I think it’s not just data science, business intelligence, or analytics, it’s more than that. And I am on the side of getting a healthcare person who’s interested in information science.”
Mergler said, “I agree with what Rush Shah has said. The problem,” he said, “is that the ones who are really good, we can’t afford. Even so, data scientists have actually been out there forever,” though often called other things, Mergler added. “When I worked for Johnson & Johnson, we went to the gaming industry, the most advanced in terms of analytics. They’ve been doing it for decades. So they’re out there, but the problem is that they haven’t been coming to healthcare as they should be.” He noted some efforts in the Cleveland metro area to attract people to data science, data analytics, and healthcare informatics that have begun to bear fruit. “The reality,” he added, “is that we need to build a team that collaborates well and works across organizational barriers. That’s the way we have to be in that game. And it’s a competitive market.”
McQuaid noted that “Some of the payers we work with are taking a blended approach. They’re taking those experienced data people they have and complementing them with people from outside the industry.”
Importantly, Love asserted, “In the future, being a nurse or being a doctor will involve using data analytics and data science. Banking on the idea that we can isolate data science and avoid it, as clinical providers, is not a logical assumption. It’s just like how we all need to type now,” because of the need to use computer keyboards, regardless of any other educational or professional preparation.
Tool Sets and Data Management
“There are a lot of tools out there, but I don’t want 20 different tools to manage my business. What will the evolution be?” Reichert asked his fellow panelists. “One of the curious things about data science is, it started with only SaaS”—software as a service—Shah said. “So I don’t think that data science suffers from fragmentation of tools as much as we think. But as we move to democratizing it, and diffusing it to all sorts of clinicians and practitioners, we need to make it easier. We had to use a tool with Carolinas Health Care, with Predixion [Software, Aliso Viejo, Calif.]. It’s going in the right direction, but is still not easy yet. So I think the problem of fragmentation is not that big.”
“I agree with Mr. Shah, but the other opportunity is looking at the tools,” Mergler said. “Most healthcare systems today do not have an enabling environment around the tools.” In fact, he said, “the true data scientists are working offline because the right tools are not available. Not many healthcare organizations have a virtual environment that has all the tools that data scientists need, and the plumbing is available to everyone through the EMR, etc. Getting the data, getting it de-identified, and other issues, are barriers. I do think that data security comes into play.”
Meanwhile, Mergler posited that even how data analytics will be pursued must change. “The traditional data science paradigm is, define the problem, figure out what you need, and I’ll give you the tool and get you the data,” he said. “But that paradigm is changing. In fact, we need to create a kind of sandbox environment” in which data can be more freely played with and manipulated in order to achieve analytics breakthroughs, he said, “and that can be a scary thing. And I don’t think there’s any operational model yet that makes that available.”
In fact, there are significant data management challenges, Reichert said. “At Metro, we did an audit. And we found that we had 40,000 Microsoft Access databases in use!”
Given that potential for organizational chaos, Dr. Sunshine asked, “Who needs to be at the table to govern this? We need to figure out how to extract data, and then how to harmonize it. The best ontology still involves a bunch of human beings putting the right labels on humongous amounts of data,” he said. “And with all due respect to the science here, in terms of data governance, and data stewardship, we’re at the beginning of that.”
Dr. Rosencrance offered that “I think it’s more than just the governance of the data itself, it’s the systematic piece. This may be a more advanced piece of this,” he said, “but there are really ‘five m’s’ of data: measuring the data, merging the data, mining the data, managing the data, and maintaining the data. And when you look at those five domains, governance falls across all of those, and each of those handoffs or transitions is fraught, and there’s an opportunity for data loss there. So looking at the framework, the how to manage and merge it, that’s a big challenge.”
Might there be some level of functional chaos ahead in all this, at least in the short term? “Democratizing anything is going to create chaos; democracy implies some chaos,” Shah said. “So if we talk about data governance, if we want a data-driven culture and want data to be constantly used, yes, 40,000 Access databases is a nightmare, but if you want people to really access and use data, you’ve got to create some kind of sandbox architecture, maybe through Hadoop,” he said. “So let’s recognize the inherent problem IT is facing now: as you democratize data and data decision-making, you have to hand over the keys. You will become an infrastructure shop more and more, and all of these departments will decentralize, will have to decentralize.”
An additional issue, Shah said, is this: “If there’s something hospitals know very well, there are two kinds of data tribes: clinicians, and quality improvement/process people. And they are very, very different in how they look at data. And one of the ways in which those two tribes differ is in focusing on things like census rates and processes, versus whether we’re doing the right things for patients based on their diseases, right? So you’ve got very different tribes. And when we think about data governance, we have to think about those two different data tribes. I meet so many people and it’s amazing how differently doctors talk from how patient flow/process improvement guys talk. It’s two different worlds, and it’s fascinating. And they do understand each other. But data governance is going to be really critical to all this. And IT people had better get prepared to democratize.”
Mergler noted that “Gartner did a report recently on the failure of data warehousing, and it cited an 85-percent failure rate. And it listed 10 top reasons for failure, and the top six all related to that you cannot have a successful data strategy if you think the answer is, give it all to IT, build a wall, expect to wait months and months for the answer. So what Mr. Shah is saying is absolutely correct: so data governance is about democratizing the data and processes. You can’t do data governance only through tight walls and sanctioned processes, you just won’t get there. No way.”