Enterprise Data Warehouse Solutions – Get the Data Basics Right, Part 2: Data Infrastructure
For the past couple of weeks, we’ve been discussing “Getting the Data Basics Right.” In last week’s blog, we covered the importance of People, Priorities and Processes as the groundwork for a future-ready data foundation, based on our most recent thought-leadership report, Digital Insurance 2.0: Building Your Future on a Robust Data Foundation. This week, we are looking at the core of the data ecosystem — the enterprise data warehouse.
One of the biggest stumbling blocks for past data initiatives (and a key reason for failure) has been the lack of a robust, implementable enterprise data warehouse (EDW) solution and underlying data model. Many prior EDW projects either attempted to build a complete EDW the first time or implemented one over several iterations to align with their core and other key systems, as well as with different sources of data. The result was a prolonged, expensive and often ineffective enterprise data solution that rarely reached the desired goal. The extensive and constant rework led many carriers to scale down their EDW to manage costs, but this in turn produced many one-off data mart requests that did not provide a foundational enterprise solution. This is why there is such a strong need for a robust and proven EDW that can efficiently capture and integrate data from internal sources first (and from external sources later), and that also provides the map to link the data to the business.
As has been widely reported, carriers are starting to look at new sources of data, both internal and external. However, they are finding it difficult to integrate these sources, except in small data sets. It is very hard to bring additional data together to help the business when carriers have not built the foundation on which to integrate it. They struggle to achieve both depth and breadth of enterprise data coverage. They can get years of historical data across a small number of siloed data sets, or current data from numerous newer, disparate data sets, but usually they cannot get both, because of data quality issues and inconsistent data semantics. Data lakes make it very easy to ingest large volumes, varieties and formats of data, but they do not solve the fundamental problem of how to integrate data (semantically and consistently) from within and outside the enterprise. Data lakes only move the hurdle further down the data flow process; they do not remove it. Thus, carriers struggle with the simpler question, “How do I get an enterprise view with the data I have before I start tackling the harder problems of blending it with unstructured data and external data?” As with many complex and long journeys, one must learn to crawl before one can walk, and only later run and finally sprint. Majesco research has found that many carriers are realizing this and are once again making an enterprise data warehouse a top priority.
The most important part of any EDW is the underlying data model. A number of models exist today. Some are very extensive, but many carriers find them hard to implement because they lack the in-house expertise and because evaluating all of their data needs is complex. One of the biggest lessons carriers learned in their early attempts to create EDWs is that it is not an easy or straightforward process. Creating a model that is flexible, extensible and robust requires years of trial and error. The model is constantly evolving, which makes early implementations, updates and rework very costly.
There are key decisions to make when creating an enterprise data warehouse that blends coverage, extensibility and analytics. Some of these architectural decisions concern the highest level of the policy and how to connect related policies across systems (e.g., umbrella policies). Others concern linking policy, billing, claims and distribution management transactions and data and, more recently, party data and related information. Only through years of experience can a carrier or vendor create a robust data model that can serve as a strong foundation for an enterprise data warehouse.
The Majesco Enterprise Data Warehouse (EDW) solution and Data Model are the result of 20 years of building operational data stores, reporting databases and enterprise data warehouses for carriers of all sizes and lines of business. The underlying data model has matured based on pragmatic business needs and analytic data requirements. It is flexible enough to handle data granularity down to the lowest level (e.g., premium data at the level of limits, deductibles, mods and factors). For example, it can acquire data at the same level captured for reserving in core administrative systems.
The Majesco EDW is based on the enterprise data model and is composed of independent but integrated components: Policy, Billing, Claims and Distribution Management. Each component can be (and is) used today by some insurers as an Operational Data Store for reporting or for data feeds into a data lake. Each component contains the pertinent policy data that the other core administration solutions require when decisions are made. For example, a claims system needs to know the policy number, the risk, the coverage and the party on the policy, as well as the effective date and terminal date. Without this information, it cannot determine whether the claim is valid.
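The claims example above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Majesco data model: the class names, field names and sample values are all hypothetical, and a real validity check would involve many more rules.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PolicySnapshot:
    """Hypothetical subset of policy data carried in a claims component."""
    policy_number: str
    risk: str
    coverages: frozenset
    parties: frozenset
    effective_date: date
    terminal_date: date


@dataclass(frozen=True)
class Claim:
    """Hypothetical, simplified claim record."""
    policy_number: str
    risk: str
    coverage: str
    insured_party: str
    loss_date: date


def claim_problems(claim, policy):
    """Return the reasons a claim fails these basic policy checks."""
    problems = []
    if claim.policy_number != policy.policy_number:
        problems.append("policy number mismatch")
    if claim.risk != policy.risk:
        problems.append("risk not on policy")
    if claim.coverage not in policy.coverages:
        problems.append("coverage not on policy")
    if claim.insured_party not in policy.parties:
        problems.append("party not on policy")
    if not (policy.effective_date <= claim.loss_date <= policy.terminal_date):
        problems.append("loss date outside policy term")
    return problems


# Made-up sample data for illustration only.
policy = PolicySnapshot(
    policy_number="P-100",
    risk="2015 sedan",
    coverages=frozenset({"collision"}),
    parties=frozenset({"A. Insured"}),
    effective_date=date(2017, 1, 1),
    terminal_date=date(2017, 12, 31),
)
in_term = Claim("P-100", "2015 sedan", "collision", "A. Insured", date(2017, 6, 1))
out_of_term = Claim("P-100", "2015 sedan", "collision", "A. Insured", date(2018, 2, 1))

print(claim_problems(in_term, policy))
print(claim_problems(out_of_term, policy))
```

If any of the policy fields were missing from the claims component, the corresponding check could not be made, which is the point of carrying pertinent policy data in each component.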
The real value of the Majesco EDW is that it goes beyond most insurance EDWs in its ability to integrate and connect the data across the four components to give an insurer the best summary of the enterprise data. It shares party information, which includes not only the insured and related parties (secondary drivers, mortgagees), but also internal parties (underwriters, CSRs, approvers, etc.) and external parties (attorneys, third-party adjusters, claimants, etc.). In addition, the Majesco EDW data model (and, thus, the Majesco EDW) can store authorization information, litigation information (both one-to-one and one-to-many, such as the Chinese drywall lawsuits), CAT information (both PCS and internally defined CATs), and much more.
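One way to picture shared party information and one-to-many litigation is the small Python sketch below. It is an assumption-laden illustration, not the actual Majesco schema: `Party`, `PartyRole`, `LitigationLink` and all identifiers and sample values are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Party:
    """One shared party record, whatever roles the party plays."""
    party_id: int
    name: str


@dataclass(frozen=True)
class PartyRole:
    """Links a shared party to a record in one component (hypothetical)."""
    party_id: int
    component: str  # e.g. "policy", "billing", "claims", "distribution"
    record_id: str
    role: str


@dataclass(frozen=True)
class LitigationLink:
    """Links a litigation to a claim; repeated rows give one-to-many."""
    litigation_id: str
    claim_id: str


def records_for_party(roles, party_id):
    """Group every record a party touches by component."""
    by_component = defaultdict(list)
    for r in roles:
        if r.party_id == party_id:
            by_component[r.component].append((r.record_id, r.role))
    return dict(by_component)


def claims_in_litigation(links, litigation_id):
    """Return the sorted claim ids attached to one litigation."""
    return sorted(l.claim_id for l in links if l.litigation_id == litigation_id)


# Made-up sample data for illustration only.
parties = [Party(1, "J. Doe"), Party(2, "K. Attorney")]
roles = [
    PartyRole(1, "policy", "P-100", "insured"),
    PartyRole(1, "claims", "C-7", "claimant"),
    PartyRole(2, "claims", "C-7", "attorney"),
]
links = [
    LitigationLink("LIT-1", "C-7"),
    LitigationLink("LIT-1", "C-8"),  # one litigation, many claims
    LitigationLink("LIT-2", "C-9"),  # one litigation, one claim
]

print(records_for_party(roles, 1))
print(claims_in_litigation(links, "LIT-1"))
```

Because the party is stored once and referenced by role, the same person can be followed across policy and claims records without matching names between systems.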
The Majesco EDW provides a carrier with coverage of almost all of the structured data in its core administration systems. The data is stored in Third Normal Form (3NF) to reflect the transactional nature of the source data. The Majesco EDW works well for feeding a carrier’s data lake, downstream systems or operational reporting. It comes with a self-service Business Intelligence (BI) front end with out-of-the-box (OOTB) dashboards and reports. However, reports run directly against a 3NF data warehouse work best when they are listing reports or predefined reports that allow for filtering. Majesco Business Analytics (MBA) enables business users to do root cause analysis by slicing and dicing their data across policy and claims.
Once a carrier has a solid foundation and visibility into its core systems data and operations, it can then load data from additional sources to gain a deeper understanding. For example, loading industry data allows a carrier to compare its performance, costs and losses against those of its competitors. Claims notes are loaded with valuable data, albeit in an unstructured format. Text mining tools exist that can take the claims notes (or other unstructured data), process them and provide additional information to help with claims fraud, building overages, litigation decisions, etc. Many data service vendors now provide this and other supplemental data in the insurance data ecosystem, and it can be stored in the EDW. Without a strong EDW foundation, however, carriers basically have to “duct tape” the data together, which makes the result fragile and not very reusable.
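To make the earlier point about Third Normal Form and filtered listing reports concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The three tables, their columns and the sample rows are hypothetical and far simpler than any real insurance model; they only show why a listing report against normalized tables is a matter of joins plus a filter.

```python
import sqlite3

# In-memory database with a tiny, hypothetical 3NF-style schema:
# each fact is stored once and related through keys.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE policy (
        policy_id     INTEGER PRIMARY KEY,
        policy_number TEXT NOT NULL UNIQUE
    );
    CREATE TABLE coverage (
        coverage_id   INTEGER PRIMARY KEY,
        policy_id     INTEGER NOT NULL REFERENCES policy(policy_id),
        coverage_code TEXT NOT NULL
    );
    CREATE TABLE premium_transaction (
        txn_id      INTEGER PRIMARY KEY,
        coverage_id INTEGER NOT NULL REFERENCES coverage(coverage_id),
        txn_date    TEXT NOT NULL,
        amount      REAL NOT NULL
    );
    """
)

# Made-up sample rows for illustration only.
conn.executemany("INSERT INTO policy VALUES (?, ?)", [(1, "P-100"), (2, "P-200")])
conn.executemany(
    "INSERT INTO coverage VALUES (?, ?, ?)",
    [(1, 1, "COLL"), (2, 1, "LIAB"), (3, 2, "COLL")],
)
conn.executemany(
    "INSERT INTO premium_transaction VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2017-01-01", 500.0),
        (2, 2, "2017-01-01", 300.0),
        (3, 1, "2017-03-15", -50.0),
        (4, 3, "2017-02-01", 450.0),
    ],
)


def premium_listing(conn, policy_number):
    """Listing report: every premium transaction for one policy."""
    cursor = conn.execute(
        """
        SELECT p.policy_number, c.coverage_code, t.txn_date, t.amount
        FROM premium_transaction AS t
        JOIN coverage AS c ON c.coverage_id = t.coverage_id
        JOIN policy   AS p ON p.policy_id   = c.policy_id
        WHERE p.policy_number = ?
        ORDER BY t.txn_date, t.txn_id
        """,
        (policy_number,),
    )
    return cursor.fetchall()


for row in premium_listing(conn, "P-100"):
    print(row)
```

A transaction-level listing like this suits a normalized store. Slicing and dicing across many dimensions usually calls for a separate analytic layer, which is the distinction the paragraph above draws between operational reporting and business analytics.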
A robust EDW foundation sets the stage for the next steps for a data-driven insurer, namely analytics. In our next blog, we will complete the story on “Get the Basics Right, Part 3: Analytics Infrastructure.” For an in-depth preview, download and read Digital Insurance 2.0: Building Your Future on a Robust Data Foundation.