In our book Data Warehousing mit Oracle – Business Intelligence in der Praxis we describe a Data Warehouse architecture with four layers. Here is a short overview of these layers.
- The Staging Area contains a copy of the last data delivery of each source system and is mainly used to reduce the time window for extracting data of the OLTP systems. The tables in the staging area have no relationships, and data from different sources are not integrated yet in this layer of the Data Warehouse.
- The Cleansing Area is used for data cleansing, filtering wrong data, replace missing values with singletons and performing transformations like code lookups or currency conversions. As the Staging Area, the Cleansing Area contains only data of the last delivery, and data from different sources is not integrated.
- The Core has two purposes: First, it is the integration layer for the data from all source systems of the Data Warehouse. The data in the Core is not stored separately for each source like in the previous layers, but stored in a “subject-oriented” form. The second purpose is to store the history of all data changes in a form that allows to retrieve previous states of the data at any time in the past. This is an important requirement for traceability of data changes, and there are different approaches to store historic data.
- The Data Marts provide the data in an adequate form that allows easy access for the front-end applications of the BI platform. Typically, the Data Marts are designed using a dimensional data model and either implemented as a star or snowflake schema or as a multidimensional cube. For each type of application and user group, a separate Data Mart should be designed. An “all purpose” Data Mart that supports all user requirements is not recommended.
This DWH architecture is used as a blueprint and reference architecture at Trivadis. But how fix is such an architecture? Do we have to decline a customer project if the Data Warehouse system of the customer contains no Core or if the Cleansing Area is missing? Or do we have to begin from scratch when the architecture does not fit our conventions?
Of course not! The only rule that is always true says: There is no rule without exceptions. In all the years I work for Trivadis as a DWH consultant, I have seen many Data Warehouses – and all kind of DWH architectures. In discussions with my customers and collegues about the best architecture of a Data Warehouse, I learnt the only perfect answer: “It depends”. Every consultant knows this answer.
In my blog I will write about some examples of real Data Warehouse projects. Some of them are near to the reference architecture described above, others use different approaches. But even if real projects are usually more complex than the examples in our DWH book, I still recommend to have our reference architecture in mind when building a new Data Warehouse.