In the last three months I wrote some articles for Trivadis about miscellaneous design patterns for Data Warehouses. All the articles are based on practice experience of several customer projects. Because most of our customers are in Switzerland, Germany and Austria, the articles are in German. If you don’t know German, the articles are probably hard to read – unless you use translate.google.com 😉
Index Strategy in Oracle Data Warehouses
In November, I had the opportunity to give a speech at the DOAG conference in Nürnberg. The article Indexierungsstrategie im Data Warehouse: Zwischen Albtraum und optimaler Performance is a summary of this presentation and explains how Data Warehouses in Oracle should be indexed. The main message is that often too many indexes are created on a DWH database. In many cases, no indexes except those for primary and unique key constraints are needed in Staging Area, Cleansing Area and Core. For the Data Marts, it is recommended to create bitmap indexes on the dimension keys of the fact tables. More detailed information can be found in the article.
Early Arriving Facts
A typical problem in the ETL processes is when facts are delivered from a source system and have to be loaded into the Data Warehouse, but the related dimension data is not yet available. The article Wenn die Fakten zu früh eintreffen describes three approaches how this problem can be solved. I wrote this text in December, but the idea for the article I had some months ago when I wrote a concept about ETL error handling for a customer. Early arriving facts are one of the reasons for this customer that the load jobs often stop during the night. With the design patterns described in the article (and in the concept for the customer), these load failures can be avoided.
Top-down and Bottom-Up Data Modeling
In January I wrote an article about the different ways to design data models for Core and Data Marts. This can be done based on the user requirements for the Data Marts (top-down) or based on the available data of the source systems (bottom-up). In the article Welche Daten gehören ins Data Warehouse? both approaches are explained, including the pros and cons of each way to create a data model. I have seen many DWH projects where the bottom-up method was used, but I think, in most cases a top-down approach would have been more useful. In some situations, a combination of top-down and bottom-up can also be feasible.