HP's Top 10 Trends in BI (and HIT) for 2009: #10 Turning the DWH on its Head - Persistent Queries, Fleeting Data
Sometimes vendors do get it (mostly) right. Hewlett-Packard put together a brief white paper in February of this year laying out its view of Business Intelligence (BI) for 2009 (and beyond), and I think it got things largely right. Its #10 trend notes the increasing integration of Complex Event Processing (CEP) engines into traditional data warehouse (DWH) and BI platforms. Below is a summary of the trend, my thoughts on whether HP got it right, and what the trend may mean for HIT.
HP Predicts: As with HP's Trend #9, this is not so much a prediction as an observation. A little background is necessary. For those of you who know this story in detail, forgive my ellipses and simplifications.
Historically, transactional and operational systems were about fleeting data and fleeting queries; that is, they were optimized to carry out individual transactions at very high rates and with very high fidelity. This posed a problem for management when they wanted to look for patterns in the transactional/operational data, and so the DWH was born. In the DWH, data was captured, organized and archived (i.e. it was “warehoused”) in order to make the data persistent, with the expectation that queries would be fleeting, subject to the whim of management and the acumen of the analyst. As managers and analysts began to run the same reports over and over, looking for exceptions as drivers of cost and profit, queries began to persist. These persisting queries put ever-increasing burdens on the DWH, and so BI was born. BI leveraged the persistence and structure of the data in the DWH but offloaded reporting and analysis. BI, then, was about mixed workloads of persistent (periodic reporting) and fleeting (drill-down on exceptions) queries against persistent data (the DWH).
All was well in the world of DWH/BI until data volumes began to grow exponentially. The traditional extract, transform, load (ETL) tools and architectures used to move data from the transactional and operational systems into the DWH began to groan under the weight of gigabytes and terabytes of data per day. What’s more, line managers and even operators began to ask for access to the DWH in order to make better (near) real-time decisions. To meet these new needs, Operational BI (OBI) was born. In OBI, the queries are still mixed (hence the BI), but the data is fleeting. The thing with OBI is that it still requires a human to consume a report and make a decision, and it is therefore bandwidth-limited, especially for automatable decisions and insights. At last we come to CEP!
In CEP, the queries are persistent: they are algorithms proactively scanning fleeting, (near) real-time data, looking for an exception or connection from which to trigger an action. There are no fleeting queries in CEP, so there is no mixed workload, and the data is pulled right off of the transactional and operational systems (often alongside baseline, benchmark or reference data from the DWH systems). In short, as stated in the title, DWH is fleeting queries and persistent data, while CEP is persistent queries and fleeting data.
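To make "persistent queries, fleeting data" concrete, here is a minimal sketch in Python. It is not how any particular CEP product works; the names TinyCEPEngine and PersistentQuery, the event fields and the baseline figures are all hypothetical and invented for illustration. The point is only that the query is registered once and stays put, while each event is examined as it passes and then discarded.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]


@dataclass
class PersistentQuery:
    """A standing rule: a condition to watch for and an action to trigger."""
    name: str
    predicate: Callable[[Event], bool]
    action: Callable[[Event], None]


@dataclass
class TinyCEPEngine:
    """Hypothetical toy engine: it keeps the queries, not the events."""
    queries: List[PersistentQuery] = field(default_factory=list)

    def register(self, query: PersistentQuery) -> None:
        self.queries.append(query)

    def on_event(self, event: Event) -> List[str]:
        """Run every standing query against one event, then let the event go."""
        fired = []
        for query in self.queries:
            if query.predicate(event):
                query.action(event)
                fired.append(query.name)
        return fired


# Hypothetical reference data, standing in for a baseline pulled from the DWH.
baseline = {"lab": 100.0, "pharmacy": 50.0}
alerts: List[Event] = []

engine = TinyCEPEngine()
engine.register(PersistentQuery(
    name="charge_over_3x_baseline",
    predicate=lambda e: e["amount"] > 3 * baseline[e["dept"]],
    action=alerts.append,  # the triggered action; here it just records the event
))

# Fleeting data: each event is seen once and never stored by the engine.
for event in [
    {"dept": "lab", "amount": 120.0},
    {"dept": "pharmacy", "amount": 400.0},
]:
    engine.on_event(event)
```

After the loop, only the pharmacy event has landed in alerts, because 400 exceeds three times the pharmacy baseline while 120 does not exceed three times the lab baseline. A real engine would add time windows, correlation across streams and far better throughput; this sketch shows only the inversion of what persists.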
The Verdict: Yes, RDBMS, Data Warehouse Appliance and BI vendors are purchasing and/or partnering with CEP vendors; yes, CEP is a natural complement to OBI; yes, CEP is trending upwards.
HIT Impact: From a clinical standpoint, virtually nil. I don’t believe that Evidence-Based Medicine and Comparative Effectiveness Research are robust enough yet for a vendor or clinical IT department to leap into the field of automated clinical decision-making on the basis of (near) real-time transactional feeds integrated with DWH feeds. Mind you, there are plenty of medical devices which do (essentially) this, but these devices are FDA-approved for very specific purposes and rarely (if ever) integrate data across multiple feeds from multiple vendors. Remember also that Clinical Decision Support (CDS) is something quite different from Clinical CEP. CDS makes suggestions and requires a human for sign-off; Clinical CEP would take action without human approval or even awareness.
From operational, data governance, compliance and surveillance standpoints, potentially quite high. Consider a (near) real-time system that rationalizes the transactional records of your Master Patient Index, Computerized Provider Order Entry, Surgical, Hospital Billing, Provider Billing, Pharmacy/Laboratory, PACS/RIS and Emergency Department Information System, and measures them against historical and industry benchmarks in your Clinical/Enterprise DWH, all while the patient is in the operating room. Such a system could catch and correctly link orphaned records (especially for “always” parallels between charges and transactions, such as specimen type “blood” and venipuncture). It could continuously assess pay-for-performance, meaningful use and quality & safety metrics against individual, departmental, hospital, system, state and national benchmarks, and automatically initiate peer review and morbidity & mortality processes when appropriate. And it could identify rare complications or interactions, whether of drugs, devices or procedures, as well as statistical clusters or seasonal trends, and initiate appropriate investigative protocols.
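As a rough illustration of the orphaned-record case, here is a small Python sketch of the blood/venipuncture pairing. Everything in it is an assumption made for the example: the event layout (kind, encounter, type), the EXPECTED_CHARGE table and the function name find_orphans are hypothetical, and it checks one short window of events in a batch rather than a true continuous stream.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical "always" pairings: a specimen type that should always be
# accompanied by a given charge on the same encounter.
EXPECTED_CHARGE = {"blood": "venipuncture"}


def find_orphans(events: Iterable[Dict[str, str]]) -> List[Tuple[str, str]]:
    """Return (encounter, missing_charge) pairs for one window of events."""
    specimens = defaultdict(set)  # encounter -> specimen types seen
    charges = defaultdict(set)    # encounter -> charge types seen
    for event in events:
        if event["kind"] == "specimen":
            specimens[event["encounter"]].add(event["type"])
        elif event["kind"] == "charge":
            charges[event["encounter"]].add(event["type"])

    orphans = []
    for encounter, types in specimens.items():
        for specimen_type in sorted(types):
            needed = EXPECTED_CHARGE.get(specimen_type)
            if needed is not None and needed not in charges[encounter]:
                orphans.append((encounter, needed))
    return orphans


# Made-up window of events from two hypothetical feeds (lab and billing).
window = [
    {"kind": "specimen", "encounter": "E1", "type": "blood"},
    {"kind": "charge", "encounter": "E1", "type": "venipuncture"},
    {"kind": "specimen", "encounter": "E2", "type": "blood"},
]
missing = find_orphans(window)
```

Here encounter E2 has a blood specimen with no venipuncture charge, so it is the one flagged. In a CEP setting the same rule would run as a standing query over the live lab and billing feeds, with a time window deciding how long to wait for the matching charge before raising the exception.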
Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.