I recently ran into an interesting problem that I’d like to share and show how I resolved it. The solution involves a catch-all join to a lookup dimension table.

Imagine having many employees that work in many departments. Each department has their own way of determining the employee’s status; Some departments use the status code that was given in the source system, other departments rely solely on the department they’re from and others use a combination of both! Oh yeah, the fun bit, this status logic can change…
How to solve with a catch-all join to a lookup table?
The first step to solving this problem is to create a lookup dimension that will be used by the presentation layer and possibly an ETL process to retrieve the proper status. This table will store the key(s) and value(s) pairs required to locate a particular status. A zero-to-many to zero-to-many relationship between the existing employee table and the new lookup table will exist.
Let’s insert some test data to play with:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 |
CREATE TABLE DIM_EMPLOYEE ( EMPLOYEE_NAME VARCHAR(3), DEPARTMENT_NUMBER VARCHAR(1), STATUS_CODE VARCHAR(1) ); INSERT INTO DIM_EMPLOYEE VALUES ('Bob', '3', 'N'); INSERT INTO DIM_EMPLOYEE VALUES ('Tom', '4', 'N'); INSERT INTO DIM_EMPLOYEE VALUES ('Joe', '3', 'X'); INSERT INTO DIM_EMPLOYEE VALUES ('Dan', '4', 'X'); INSERT INTO DIM_EMPLOYEE VALUES ('Ian', '1', 'C'); INSERT INTO DIM_EMPLOYEE VALUES ('Max', '6', 'A'); INSERT INTO DIM_EMPLOYEE VALUES ('Jay', '6', 'Z'); CREATE TABLE DIM_STATUS_LOOKUP ( DEPARTMENT_NUMBER VARCHAR(1), STATUS_CODE VARCHAR(1), STATUS VARCHAR(10) ); INSERT INTO DIM_STATUS_LOOKUP VALUES ('3', 'N', 'Active'); INSERT INTO DIM_STATUS_LOOKUP VALUES ('4', '*', 'Active'); INSERT INTO DIM_STATUS_LOOKUP VALUES ('3', 'X', 'Deactive'); INSERT INTO DIM_STATUS_LOOKUP VALUES ('*', 'A', 'Inactive'); INSERT INTO DIM_STATUS_LOOKUP VALUES ('*', 'C', 'Unkonwn'); |
Notice that I’m using an asterisk (*) to denote a wildcard, catch-all value, meaning anything can match. So, in this example, Max will match to an Inactive status because his status code is an ‘A’, and the rule states that no matter which department an employee is employed if the employee has a status of ‘A’, they are considered inactive.
How would a join work between these two tables?
Let’s take a look!
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 |
SELECT E.EMPLOYEE_NAME ,DSL.STATUS FROM DIM_EMPLOYEE AS E LEFT JOIN DIM_STATUS_LOOKUP AS DSL ON DSL.DEPARTMENT_NUMBER = (CASE WHEN DSL.DEPARTMENT_NUMBER = '*' THEN DSL.DEPARTMENT_NUMBER ELSE E.DEPARTMENT_NUMBER END) AND DSL.STATUS_CODE = (CASE WHEN DSL.STATUS_CODE = '*' THEN DSL.STATUS_CODE ELSE E.STATUS_CODE END) ORDER BY 1; /* Sample Output */ /***************************** * EMPLOYEE_NAME | STATUS * *---------------------------* * Bob | Active * * Dan | Active * * Ian | Unkonwn * * Jay | (null) * * Joe | Deactive * * Max | Inactive * * Tom | Active * *****************************/ |
With the use of a case statement in the join condition, we can make use of the catch-all join functionality of the lookup table. Pretty cool, right?
To see this in action, take a look at this example in a SQL Fiddle!