This past June, Iowa was hit by some of the worst flooding in the state’s history, and right in the thick of it was the 370-bed Mercy Medical Center. More than 4,000 homes in the area had to be evacuated, and the hospital was forced to move 176 patients to nearby facilities. Yet even though water levels rose so high that sandbags had to be piled up outside the doors, physicians were never left in the dark: the facility’s network, EMR and communication systems stayed up during the entire ordeal. For Jeff Cash, Mercy’s vice president and CIO, it was the ultimate test of his staff’s preparedness.
KH: As far as patient records go, were clinicians able to access EMRs the entire time?
JC: Absolutely. We really have two primary systems that house our EMR. We use the Meditech Magic system, and we have two installations of it: one running in our primary data center and a backup available in our secondary data center. The primary data center wasn’t really affected by the flood at all, so we continued to operate Meditech there. We did decide to shut it down at one point, for a brief period of about 18 hours after we had evacuated our patients, and that was for the internal records we use. That was just a precautionary measure; we were concerned that if for some reason our generator power couldn’t be distributed, we’d have a big problem.
What we did to remediate that was, we had several contractors come in during the flood, and we pulled generator feeds directly through the first floor of the hospital. We had a permanent installation done between our generators and our primary data center that completely bypassed all the electrical switches on the ground floor and below. So we were able to pipe generator power directly in, almost through an overhead system, into the primary data center. With that, we were comfortable that we would not lose electrical power again in the foreseeable future, so then we started bringing all of those systems back up.
KH: Now that you’ve been through a disaster situation, how prepared would you say that Mercy was for such an event? Did you have the necessary steps in place?
JC: We have a business continuity plan that we had used to drive most of the planning for what we built, so I think that helped us prepare for the vast majority of the relocations and the redundancy that we had. We have tested some of that redundancy in the past, especially things like our communication systems and Call Manager; we’ve run tests to fail them over. We’ve brought up our secondary Meditech system in the past, so we’ve been able to test that as well, even though we didn’t have to use it this time.
One of the things we’ve planned for is to make our data centers as portable as possible, with the intention that if something like this came up, or if we needed to move a data center at some point in the future, we’d be able to do that. Our data centers have been built in a modular fashion, and we’ve tried to keep all of our cabinets self-contained in the sense that we’ve moved to a blade server infrastructure. Over the last couple of years we’ve been replacing all of our traditional servers with blade servers at a very rapid rate, which makes it a lot easier to have a smaller number of cabinets to move if we have to be portable.
We moved to a fiber-based architecture for all of our network switching, and we’ve put in a large storage area network that’s redundant between both of our data centers as well. We use two big HP EVA SANs, and we have Business Copy running between them, between our data centers. The idea is that if you had to move a cabinet, or an entire data center, it should be as easy as pulling the power off the cabinet, pulling the network connections off the cabinet, moving it to an alternate location, plugging it back in where you have a network connection, and being back online.
I think that helped us with the ability to move out of the data center. The evacuation of our second data center lasted a bit longer, so what we chose to do was remain in our primary data center, and we took the opportunity to expand, rebuild and update our second data center significantly. We did end up doubling up on our primary data center for a longer period of time than you might traditionally have expected, due to the evacuation.
We were able to do that with those modular building blocks, essentially, not having a PBX to tie us down and not having all the traditional copper cabling to tie us down. By keeping a high concentration of servers in a single cabinet, it’s much easier to move them on a portable basis.
KH: Does it become a challenge to prioritize which systems will stay up during an emergency?
JC: We have tried to create, as much as is reasonable for a hospital our size, a plan to help deal with that. We have gone through the business continuity planning process to decide what our critical systems are, and we’ve tried to keep two of each of those systems available in the hospital to the extent we can. Some systems run in a hot, redundant configuration, like Cisco Call Manager; we have two of them with automatic hot failover, split between our data centers. For those systems that don’t lend themselves to hot failover like that, but that we can run in a hot standby mode and manually convert over to, we’ve done that as well.
As an example, we have a second Meditech installation at our other data center that we can fail over to if we need to. The application itself, however, does not fail over automatically, so there’s manual intervention required with that.
And for those devices where we don’t feel we need that critical level of redundancy, where we don’t need two in house, we try to put them on a VMware platform to make them transportable and easy to migrate to another hardware platform should we need to relocate them. Let’s say, for example, we lost one of our data centers and we didn’t have full redundancy with all of our hardware; VMware tends to be hardware-agnostic, so we could reload the VMware portion of it and bring the applications back up on a new platform, on current hardware or hardware that maybe we haven’t used before. That’s as opposed to the old way, where the operating system was built specifically for a set of hardware, and in a disaster it’s hard to find that exact same configuration again. So we’re trying to move as much as we can toward VMware as well. I don’t know exactly how many servers we have in VMware today, but I think it’s probably upward of 70 or so, which makes them very fault-tolerant in some cases and easily restorable in other cases.
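To illustrate the kind of hardware-agnostic restore Cash describes, the sketch below re-registers a virtual machine whose files survived on shared storage onto replacement hardware through vCenter, using the open-source pyVmomi library. It is only a sketch of the general technique, not a description of Mercy’s actual tooling: the vCenter address, credentials, datastore path and VM name are all hypothetical.

```python
# Illustrative sketch only: hostnames, credentials, and paths are hypothetical,
# not Mercy's environment. Requires the pyVmomi library (pip install pyvmomi).
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask

# Connect to a (hypothetical) vCenter that manages the replacement hardware.
ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.example.org", user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)
content = si.RetrieveContent()

# Assume the first inventory object is the datacenter; grab its VM folder
# and the first compute resource's resource pool.
datacenter = content.rootFolder.childEntity[0]
vm_folder = datacenter.vmFolder
compute = datacenter.hostFolder.childEntity[0]
resource_pool = compute.resourcePool

# The VM's files still live on shared (or replicated) storage, so restoring it
# on new hardware is just a matter of registering the existing .vmx file.
vmx_path = "[shared_datastore] clinical-app-01/clinical-app-01.vmx"
task = vm_folder.RegisterVM_Task(path=vmx_path, name="clinical-app-01",
                                 asTemplate=False, pool=resource_pool)
WaitForTask(task)

# Power the restored VM on, then disconnect from vCenter.
vm = task.info.result
WaitForTask(vm.PowerOnVM_Task())
Disconnect(si)
```

Because the guest operating system only sees virtual hardware, the same .vmx and disk files come up cleanly on a host that may be a different make, model, or generation than the one that was lost.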
KH: I think that’s a trend we’re going to start seeing more of as disaster preparedness becomes a bigger priority. A hospital certainly doesn’t have to be in a hurricane-prone area to want to be ready for an emergency. Even with your own facility: Mercy was impacted by a flood even though it’s located 10 blocks from the river. You just never know what to expect, right?
JC: Definitely. We kind of always thought we might get hit with a tornado, which is a very real possibility in our part of the country. Or, we thought, we’re not too far from the New Madrid Fault, so there’s always a chance we could pick up a small earthquake, not big enough to do much damage, but enough to shake things up a bit. You never know what that could do to your IT equipment. But we sure never predicted that we’d have a flood here.
The reason we sandwiched our second data center between the first floor and the basement was specifically so that it was below grade, where a tornado wouldn’t take it out, but still above the basement, so that flooding wouldn’t take it out either. That was good for us, because we were able to save the data center. But because everything around it was destroyed, we went ahead and evacuated it and took the opportunity to rebuild it and make it bigger and better.
KH: It’s great that you were able to take the opportunity to improve the other data center, because as I’m sure this experience has shown, you just never know what can happen.
JC: No, you don’t. I think you have to take the natural disaster component out of it and know that any disaster, whatever it is, may cause you to either evacuate or lose a portion of your data center for a period of time, and you have to know how to prepare yourself for that. You can do that independent of what the natural disaster is.
KH: How important is it that there are steps in place for what to do in a situation like a flood?
JC: It’s absolutely critical. For me, as the CIO, it’s one of those things that’s a once-in-a-lifetime opportunity, you hope, to test what you’ve done and see if it’s all going to go well. It just so happened that I was in Europe with my family on a three-week vacation when this happened, and I got a call at about 4 a.m. saying that they had evacuated the hospital. We were able to get back within a few days to help with all the clean-up and with putting things back together, but I wasn’t here for the flood.
But the fact that my team had enough equipment, enough leadership and enough planning in place to be able to get through this without losing access to any of our critical systems on an unplanned basis was a good testament to the planning we had done. It made me feel good to know that they’re capable of doing that even without me here. There’s no better way to test your leadership skills than to go through this and not be here.
Another thing we learned during that process concerns our two external communication sources.
What we learned is that we had made the decision, a number of years ago, to host our own website here onsite, because we own the security and the Internet access and we thought we had plenty of redundancy to handle that. There was a period of time during the flood, however, when our provider’s Internet service to us was lost, so we did not have external communications through the Internet. However, we had a website provider that had originally developed the website for us, and we had just put a contract in place to have them continue hosting for us, so that we could migrate back to their hosting facility. We used that to move our website to their site and bring it back up, so that we still had a consumer and employee communication tool available to us even without Internet here at the hospital. They were able to take a copy of what we already had in place for our website, bring it back up within a matter of a few hours, republish it, and repoint our URL to their site, and they became the new hosting agency for our website.

The consumer information was extremely important, because we had close to 1,000 volunteers come here to help save the hospital through sandbagging efforts. We had vendors that came in and just took on teams, with their own leadership, to handle plumbing or facilities or electrical. They all became part of a hospital team on their own initiative.
So we had to be able to communicate with them and with our employees about the status of what was going on, and the easiest way to do that was through our public website. We’ve migrated it to this new hosting facility, and I think we’ll leave it there for the foreseeable future.
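The website cutover Cash describes ultimately comes down to repointing the site’s DNS record at the new hosting provider and confirming the site answers from there. The short sketch below is purely illustrative, using only Python’s standard library; the domain and expected address are hypothetical, not Mercy’s.

```python
# Illustrative sketch only: the domain and expected address are hypothetical.
import socket
import urllib.request

SITE = "www.example-hospital.org"    # hypothetical public website
EXPECTED_HOST_IP = "203.0.113.10"    # hypothetical address of the new hosting provider

# 1. Check where the site's hostname currently resolves.
resolved_ip = socket.gethostbyname(SITE)
print(f"{SITE} resolves to {resolved_ip}")
if resolved_ip != EXPECTED_HOST_IP:
    print("DNS has not finished pointing at the new host yet.")

# 2. Confirm the site actually answers from wherever it resolves.
try:
    with urllib.request.urlopen(f"http://{SITE}/", timeout=10) as resp:
        print(f"Site responded with HTTP {resp.status}")
except OSError as exc:
    print(f"Site is not reachable: {exc}")
```

A check like this can be run from a connection outside the hospital, which matters in exactly the scenario Cash describes, when the facility’s own Internet service is down.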
KH: Going back to the electronic records, how were you able to ensure that clinicians had access even as staff and patients were being transported?
Part III Coming Soon