Data Vault is becoming more and more popular for modeling Data Warehouses. Some of my colleagues asked me for book recommendations on this modeling method. Here is a short review (from my personal point of view) of two standard books on Data Vault.
Modeling the Agile Data Warehouse with Data Vault
This book by Hans Hultgren helped me to understand the concepts of Data Vault modeling and how to design a data model with Hubs, Links and Satellites. It gives a good overview of the core constructs of a Data Vault model and of some special cases to be considered when designing a Data Warehouse with Data Vault.
After an introduction to Ensemble modeling methods in general, the book explains the core constructs of Data Vault modeling in detail and shows how Hubs, Links and Satellites should be designed. The modeling concepts are described with simple but practical examples. Further chapters cover additional design topics for each type of Data Vault table, including design recommendations and advanced concepts.
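For readers who have not yet seen the three table types, here is a minimal sketch of how they might fit together. It is my own illustration and not taken from the book. The table names, the columns, the MD5 hash keys (a convention common in Data Vault 2.0) and the helper functions `hash_key`, `load_customer` and `load_purchase` are all assumptions, and SQLite is used only to keep the example self-contained.

```python
import hashlib
import sqlite3


def hash_key(*business_keys):
    """Build a deterministic key from business keys (assumed convention)."""
    normalized = "|".join(key.strip().upper() for key in business_keys)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


# Hubs hold business keys, Links hold relationships between Hubs,
# Satellites hold the descriptive attributes and their history.
DDL = """
CREATE TABLE hub_customer (
    customer_hk   TEXT PRIMARY KEY,
    customer_no   TEXT NOT NULL UNIQUE,
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL
);
CREATE TABLE hub_product (
    product_hk    TEXT PRIMARY KEY,
    product_no    TEXT NOT NULL UNIQUE,
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL
);
CREATE TABLE link_customer_product (
    customer_product_hk TEXT PRIMARY KEY,
    customer_hk   TEXT NOT NULL REFERENCES hub_customer (customer_hk),
    product_hk    TEXT NOT NULL REFERENCES hub_product (product_hk),
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL
);
CREATE TABLE sat_customer (
    customer_hk   TEXT NOT NULL REFERENCES hub_customer (customer_hk),
    load_date     TEXT NOT NULL,
    name          TEXT,
    city          TEXT,
    record_source TEXT NOT NULL,
    PRIMARY KEY (customer_hk, load_date)
);
"""


def load_customer(conn, customer_no, name, city, load_date, source="CRM"):
    """Insert the Hub row once; add a Satellite row only when attributes change."""
    hk = hash_key(customer_no)
    conn.execute(
        "INSERT OR IGNORE INTO hub_customer VALUES (?, ?, ?, ?)",
        (hk, customer_no, load_date, source),
    )
    latest = conn.execute(
        "SELECT name, city FROM sat_customer "
        "WHERE customer_hk = ? ORDER BY load_date DESC LIMIT 1",
        (hk,),
    ).fetchone()
    if latest != (name, city):
        conn.execute(
            "INSERT INTO sat_customer VALUES (?, ?, ?, ?, ?)",
            (hk, load_date, name, city, source),
        )


def load_purchase(conn, customer_no, product_no, load_date, source="SHOP"):
    """Insert the product Hub row and the Link row (customer assumed loaded)."""
    customer_hk = hash_key(customer_no)
    product_hk = hash_key(product_no)
    conn.execute(
        "INSERT OR IGNORE INTO hub_product VALUES (?, ?, ?, ?)",
        (product_hk, product_no, load_date, source),
    )
    conn.execute(
        "INSERT OR IGNORE INTO link_customer_product VALUES (?, ?, ?, ?, ?)",
        (hash_key(customer_no, product_no), customer_hk, product_hk,
         load_date, source),
    )


conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
load_customer(conn, "C-100", "Alice", "Zurich", "2024-01-01")
load_customer(conn, "C-100", "Alice", "Zurich", "2024-01-02")  # unchanged, no new row
load_customer(conn, "C-100", "Alice", "Bern", "2024-01-03")    # changed city, new version
load_purchase(conn, "C-100", "P-7", "2024-01-03")

history = conn.execute(
    "SELECT load_date, city FROM sat_customer ORDER BY load_date"
).fetchall()
print(history)  # [('2024-01-01', 'Zurich'), ('2024-01-03', 'Bern')]
```

The point of the sketch is the separation of concerns: the business key lives only in the Hub, the relationship only in the Link, and every change to descriptive attributes becomes a new Satellite row instead of an update.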
The main focus of the book is data modeling and Data Warehouse architecture. Other topics, such as data integration and loading data into and extracting data from Data Vault tables, are mentioned, but they are not explained in much detail.
From my point of view, the book by Hans Hultgren is a very good introduction to the modeling concepts of Data Vault, very informative and easy to read. The concepts are clearly explained, and the examples are descriptive. The only shortcoming is the missing index, which makes it harder to find specific topics. Of course, this is only an issue for the printed version, not for the eBook edition.
Building a Scalable Data Warehouse with Data Vault 2.0
To be honest, I was not very excited about the previous books by Dan Linstedt. They were quite hard for me to read. But his newest book, which he wrote together with Michael Olschimke, is very practical and contains a lot of useful implementation details.
The book starts with a good introduction to Data Warehousing and DWH architecture. Then it explains the methodology and modeling rules of Data Vault 2.0. The additional chapters about intermediate and advanced Data Vault modeling topics are very useful.
For me, the most interesting parts of this book are the chapters about loading Data Vault, implementing data quality and loading dimensional Information Marts. These chapters contain many practical implementation details and SQL examples that show how a Data Vault model can be loaded and used as a source for dimensional data models. The examples are implemented for SQL Server, but can easily be adapted to other database technologies (in my case, to Oracle).
I skipped some of the chapters and examples with many screenshots, because they are very specific to Microsoft tools that I don’t use. Nevertheless, I recommend the book to everybody interested in how to implement Data Vault models and the corresponding load processes based on practical examples.
Which Book Should You Buy?
So, which of the books is better? As a typical consultant, my answer is of course: “It depends”. If you are mainly interested in the design of a Data Vault model, the book by Hans Hultgren is definitely a good choice to learn and understand the concepts of Data Vault. If you want to know more about the implementation of a Data Vault model and the corresponding ETL processes, I recommend the book by Dan Linstedt and Michael Olschimke because of its useful implementation examples.
If you are interested in all aspects of Data Vault, do as I did and buy both books. They are a very good combination.