Regardless of your RDBMS you’ll find more than a handful of scripts online to add a Date Dimension or Calendar Table to your schema. One problem I’ve seen is they lack an every-other-week column. So, let’s explore a common SQL Server script and how to add an Every Other Week flag to our date dimension.
This method is written for Microsoft’s SQL Server, but it is ANSI standard and will work with any date dimension which has an integer day-of-week column, which I’ve never seen one that doesn’t…
The topic of dynamic dimension descriptions isn’t new, but the method for which I’ve come up with is a hybrid of several other methodologies, making for a great subject piece. Using codes and descriptions in your warehouse dimensions is standard practice, even in some cases the use-case may require displaying a code versus a description, for example: displaying USA instead of United States of America, or even in situations where the business understands what a “TPS” report type is more so than the formal “Testing Procedure Specification” description. In any case, you’ll want a method that enables the ability to add and modify additional codes and their descriptions without having to perform risky manual updates to the warehouse.
In this post, I will discuss a rigid yet tolerant way for properly implementing dynamic dimension descriptions without ever directly modifying a warehouse dimensional table. Instead, we’ll implement a lookup table that an end user can insert and update freely, along with a robust ETL process that uses this same lookup table to perform description updates and even type 2 historical tracking, if necessary.
Dynamic Dimension Descriptions Using An ETL Lookup Table
Referential Integrity in the Data Warehouse is a controversial topic amongst BI Professionals; you’re either all for it or all against it. While keeping the focus on RI, solely with regard to the Data Warehouse, I want to discuss some pros and cons to help you make a better, more informed decision that will have long-lasting impacts to your warehouse, development lifecycle, and production maintenance.
You need to understand that referential integrity is a decision, not a standard. Blindly implementing RI because you’re “supposed to” is a presumptuous, neglectful response to a complex problem. I think that everyone will agree that leaving constraints at the application layer is an apprentice mistake, nevertheless, when it comes to data it may be our best choice. Yet, still, having table-driven constraints is always preferred — if it makes sense!
When does it make sense to have Referential Integrity?
I recently ran into an interesting problem that I’d like to share and show how I resolved it. The solution involves a catch-all join to a lookup dimension table.
Imagine having many employees that work in many departments. Each department has their own way of determining the employee’s status; Some departments use the status code that was given in the source system, other departments rely solely on the department they’re from and others use a combination of both! Oh yeah, the fun bit, this status logic can change…