Philadelphia-based Penn Medicine has invested many years and millions of dollars in electronic health record infrastructure to capture data at the point of care. More than seven years ago it also started building a large clinical data warehouse that now holds records on 3 million patients going back 10 years.
That investment is now paying off as Penn data scientists use the data to fine-tune algorithms and predictive logic that could identify adverse events such as the onset of sepsis. On July 14, I had a chance to interview Brian Wells, Penn Medicine’s associate vice president of health technology and academic computing, and Michael Draugelis, chief data scientist, about “Penn Signals,” a platform that provides the tools needed to build, test and deploy predictive applications powered by Penn’s EHR data stream.
“We can run an algorithm against real-time data coming out of the EMR and do predictions almost at the point of care,” said Wells. “That is the foundation we have laid. Because of that investment in having the data organized, aggregated, and mapped in a way that is very usable, it enables our team to do some things that are pretty amazing.”
In fact, Draugelis, who came to Penn Medicine after a career as chief data scientist at Lockheed Martin, said he was attracted to Penn Medicine because of the great job they have done curating the data. (Penn Medicine is a $4.3 billion organization with more than 2,000 physicians providing services to the Hospital of the University of Pennsylvania, Penn Presbyterian Medical Center, Pennsylvania Hospital, Chester County Hospital and a health network that serves the city of Philadelphia, the surrounding five-county area and parts of southern New Jersey.)
Penn also bought into the vision of a small data science team building a framework, Penn Signals, that taps into years of retrospective data as well as real-time data to develop these algorithms, and that sends the resulting insights into operational channels for clinicians, Draugelis said.
“We embed the data scientist with the clinical team to figure out what the experts are doing right now and that is going to point us in the right direction,” he explained. For instance, with the sepsis algorithm, Penn Medicine already had a system called early warning system 1.0, and the clinical team developed a decision tree with seven clinical variables that did quite well, he said. “We started there, but then we pulled thousands of variables together — all the vitals, labs and medications, etc., and Penn Signals puts this into a real-time matrix that we can apply these algorithms to in order to make predictions,” he explained. The result is a snapshot, not just of one moment, but of a trend in time. “How these hundreds of variables interact over a period of time creates a tapestry and that is what we are looking to recognize,” he explained.
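To make the idea of a "real-time matrix" that captures a trend rather than a single moment more concrete, here is a minimal Python sketch. It is not Penn Signals code; the function names (`slope`, `feature_row`), the six-hour window, and the sample variables are all assumptions made for illustration. For each variable it records the latest value and a least-squares slope over the recent window, which is one simple way to represent "a trend in time."

```python
from statistics import mean


def slope(points):
    """Least-squares slope of (hours, value) pairs; 0.0 if it cannot be computed."""
    if len(points) < 2:
        return 0.0
    t_bar = mean(t for t, _ in points)
    v_bar = mean(v for _, v in points)
    denom = sum((t - t_bar) ** 2 for t, _ in points)
    if denom == 0:
        return 0.0
    return sum((t - t_bar) * (v - v_bar) for t, v in points) / denom


def feature_row(observations, variables, now, window_hours=6.0):
    """Build one patient's row: [latest, slope] for each variable in order.

    observations: iterable of (variable_name, time_in_hours, value) tuples.
    The window length is a hypothetical choice, not a documented setting.
    """
    row = []
    for name in variables:
        recent = sorted(
            (t, v)
            for var, t, v in observations
            if var == name and now - window_hours <= t <= now
        )
        latest = recent[-1][1] if recent else None  # None marks a missing variable
        row.extend([latest, slope(recent)])
    return row


if __name__ == "__main__":
    # Hypothetical vitals and labs for one patient, times in hours.
    obs = [
        ("heart_rate", 0.0, 80.0),
        ("heart_rate", 2.0, 90.0),
        ("heart_rate", 4.0, 100.0),
        ("lactate", 3.0, 2.1),
    ]
    print(feature_row(obs, ["heart_rate", "lactate"], now=4.0))
```

Stacking one such row per patient yields a matrix that a forecasting model could score each time new data arrives; a production system would use far more variables and richer features than this sketch.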
“What we find is that, of course, the variables the clinicians looked at are important in forecasting these events, but we find tens of others that are important,” Draugelis said. “Two things happen: we get a more powerful forecasting algorithm, and we have deployed that. Secondly, we create points of research, where we can say there are other variables that are showing strong forecasting power for the onset of severe sepsis. Why is that? That is a very important focus at a university hospital, where it sets up points for more clinical research to understand those things.”
The fact that the data scientists are embedded with the clinical team instead of just providing some black-box solution is critical, he stressed. “Some of the variables are red herrings,” Draugelis said. “We might be measuring the care processes and not the illness. We don’t want to use those. As much as we as data scientists try to protect from that, clinicians can segregate those out pretty quickly. With these data models it is easy to do the sub-optimal thing and go off on a dirt road that is dangerous,” he added. “To get the correct solution for the patient, it is really important to be connected to the care team along the way and have this cycle of iteration with their pathway.”
Penn Medicine has a pipeline of algorithms in the works. The first was sepsis; another focuses on heart failure and risk stratification. The heart failure team is working to define the Penn pathway for heart failure and to determine how it can catch signals early, before an official diagnosis, that indicate whether a person could benefit from advanced care.
Draugelis said a key focus of his team has been to reduce pain points in developing the algorithms and make them accessible for health systems to deploy, so they can focus less on the technology. “Any time you produce these new pieces of information that never existed before, you need to do a redesign of your [care] pathway, and that is where the hard work is,” he said. “We did this completely with open source technology and our plan is to share it as much as we can with other institutions so that they can take advantage of these things.”
Other Things in Progress at Penn Medicine
Besides the clinical data warehouse, Penn Medicine also has created a research data warehouse dubbed “PennOmics,” which provides a centralized location for fully de-identified clinical data, replacing research data silos around campus. I asked Brian Wells for a quick update on PennOmics and other developments at Penn Medicine.
People are discovering that PennOmics lets them find things they could not find before, he said. A cardiologist with a grant-sponsored study requested data on about 5,000 patients. Later, researchers in ophthalmology were able to mine that same data set looking for patients with markers for glaucoma. “So we are able to leverage the investments we make in sequencing for other research, which is a big plus, now that the data is all in the same area,” Wells said.
• Populating the Oracle Health Sciences Network: Penn Medicine was the second Oracle customer to sign up for the cloud-based service that gives its partners the capability to query Penn Medicine’s de-identified data and find populations to meet their needs. “We have just signed our first contract with a local clinical research organization that is beginning to get trained and run queries,” Wells said. “We are working with them to refine that tool.” One challenge is that a lot of the data used for inclusion or exclusion from a trial is in unstructured text, so Penn Medicine is evaluating natural language processing (NLP) tools to mine that data. “The next big area of data extraction and mining is NLP-based,” he said.
• Automating biobank workflow: “We have hundreds of thousands of samples downloaded in a vendor-provided laboratory information system that can do sample tracking and sample inventory management,” Wells said.
• Linking clinical trial data to Epic EHR: Penn has just agreed to expand its existing clinical trial management system from the cancer center alone to the entire enterprise, and it is about to begin a project to feed EHR data into that tool to make enrollment and patient status management easier, faster and more accurate. Integrating that system more closely with the Epic system will take longer, Wells said. “One of the benefits of integrating trial management with the EMR is to do EMR-based research billing reconciliation to track all the charges on a given trial and make sure the patient’s insurance doesn’t get charged for a service that was research-oriented,” he said.
• Clinical decision support for precision medicine: “We are working with Epic to try to educate them as to how genetic data in a discrete form could flow into their system and set triggers or flags as to each genetic variation the patient might have,” he said. “So for instance, we could write rules or alerts that pop up for the provider that this patient is a slow metabolizer of this drug and you might want to prescribe drug X, not drug Y. We are not as far along as I would like, but we are talking to all the relevant parties and encouraging them to improve their systems to make that possible.”
• Patient-generated data: Penn Medicine has just started working with Apple and Epic on remote patient data capture. Its first trial in that area kicked off several weeks ago. “We are enrolling 20 postpartum women in a blood pressure-tracking study,” Wells explained. They will take their blood pressure once a day or more with a Bluetooth cuff that links to their iPhone, which then feeds the data into Penn Medicine’s EHR in three seconds or less. “So far, so good,” Wells said. “We are looking forward to collecting enough information so we can provide some feedback about feasibility of it, the technical challenges, and the validity of the data.”