Dynamic Querying Using Block Quotes

Dynamic SQL, a photo of a single train track splitting into many tracks.
Dynamic SQL, a photo of a single train track splitting into many tracks.

In ETL, we often have to load many targets from a common set of base tables. Inevitably the targets are different enough that we have to create multiple queries or views to populate the many outputs of data. Which is fine, except now you’ve got yourself a maintenance nightmare, one that is avoidable. I’d like to share a trick with you to take a single query that can be recursively modified to dynamically change its structure to get different outputs. I call it, dynamic querying using block quotes.

The business case:

The sales department would like to take a single report that already exists and split it into two reports. The additional report will require different fitlers, aggregates, columns, and even joins!

Continue reading →

SQL Foo | Method to Find Data Patterns

When dealing with transactional data often there are many levels of granularity lying within. Finding these granularities exposes how your data is shaped as it accumulates and helps paint a better picture of what I like to call Lifes within the data. In this post, I want to share a technique I use to find data patterns which will be beneficial for everyone from the analyst to the architect.

Why do I refer to these data patterns as Lifes?

I haven’t found anything transactional in nature that doesn’t have some sort of recurring theme, with a distinct beginning and end, that couldn’t tell a story. It is these finite beginning/ends, start/stops, on/offs that paint the picture that is the “life” of the data. The life of these stories often have many sub-narratives and are interwoven within a single holistic life of the data. A great example is the familiar case of a customer purchase history. The customer is the holistic life of the data, their purchase orders, individual line items, and even a particular line item purchased over several purchase orders are all examples of sub-narratives within the story of a single customer.

Continue reading →

Properly Implementing Dynamic Dimension Descriptions

The topic of dynamic dimension descriptions isn’t new, but the method for which I’ve come up with is a hybrid of several other methodologies, making for a great subject piece. Using codes and descriptions in your warehouse dimensions is standard practice, even in some cases the use-case may require displaying a code versus a description, for example: displaying USA instead of United States of America, or even in situations where the business understands what a “TPS” report type is more so than the formal “Testing Procedure Specification” description. In any case, you’ll want a method that enables the ability to add and modify additional codes and their descriptions without having to perform risky manual updates to the warehouse.

In this post, I will discuss a rigid yet tolerant way for properly implementing dynamic dimension descriptions without ever directly modifying a warehouse dimensional table. Instead, we’ll implement a lookup table that an end user can insert and update freely, along with a robust ETL process that uses this same lookup table to perform description updates and even type 2 historical tracking, if necessary.

Dynamic Dimension Descriptions Using An ETL Lookup Table

Dynamic Dimension Descriptions diagram displaying how to load new, unknown codes.
Dynamic Dimension Descriptions diagram displaying how to load new, unknown codes.

Continue reading →

SQL Foo | Creative Use of Null Values

One of my favorite uses of set-data manipulation involves using NULL values to my advantage; from NULLIF to COALESCE, we’ll explore some creative use of null values. These tips & tricks aren’t just a way to convert a NULL to another value, they’re a multi-purpose, insanely powerful way to massage and combine data.

NULL Galaxy - Creative use of null values
NULL Galaxy – Creative use of null values

Level Set: What is a NULL?

In SQL a NULL value isn’t a value at all – it’s lack of value. It’s a value that is indicative of not having a value. Think of it as if you asked someone a question but they didn’t respond, their response was a NULL value. So, therefore you cannot compare someone’s non-response to someone else’s non-response, while they seem like the same answer they’re likely not even the same question!

There are some exceptions to this rule (thanks, Microsoft) but, like most of my blog posts, I try to stick with ANSI standard rules. While some RDBMSs treat NULL differently and even have switches that can be set during runtime to alter how NULL logic works, we’re not going to go there. Just assume ANSI 99.

Continue reading →

Interview Question | Weed Out Non Technical Candidates

Weed out non technical candidates by asking how to open notepad.
Weed out non technical candidates by asking how to open notepad.

I’ve since evolved a little since I posted this and would like to hone my focus of this post to a clearer target: Look For Ridiculous instead.

This is my hands-down, go-to interview question to weed out non-technical candidates. I have had such great success with this interview question and have shared it with other colleagues who have experienced equal success that I think it deserves a place on this blog. I’ve talked about this before, a little over a year ago is when I made my first public announcement of my new-found favorite interview question:

How do you open notepad on windows?

I typically preface the question with a short disclaimer like, “This question is a bit unorthodox and there really isn’t a wrong answer, having more than one answer is also equally acceptable…” This sets the stage for the actual question which I’ll phrase like “On a windows machine, regardless of version specifics, how do YOU open windows notepad?”

Continue reading →

SQL Foo | Count within a case statement

Just the other day I was asked to help a peer solve an SQL problem they were having; they were trying to count within a case statement which was proving to be problematic. They had several columns within the query, most using an analytical function to find the sum, min or max of various fields – but there was one instance where she needed to count the number of occurrences of a situation, a situation that required several other fields to determine, and without adding those other fields to the GROUP BY would break the logic.

Counting like a ninja with SQL.
Like a ninja, count within a case statement.

How to count within a case statement?

Continue reading →