Consistency Over Standards

If our number one standard is consistency over standards then reviewing code becomes clearer.
Our number one standard is consistency over standards.

When maintaining and improving upon legacy code it’s easy to redefine standards, or define nonexistent standards, which have cascading effects on the pipeline. Take something as simple as an ill-conceived naming convention which, in hindsight, turned out to not make sense over time. At some point the benefits of rewriting lots of code outweigh simply sticking with the bad convention. In these cases consistency over standards become the convention.

Continue reading →

Join Two Type-2 Tables and Rebuild History

I recently was tasked with building a new table with two Type-2 tables as a source and, not only maintain the history but, rebuild the history. Let’s look at the best way to join two type-2 tables together and then, more specifically, how to join the two historized tables together while preserving the logical history of changes as though they were one.

Joining two type two tables with history maintained.
Join Two Type-2 Tables

The Setup:

Imagine you have two tables, typically type-2 dimensions, which have persisted historical records of changes: One for capturing your customer’s name over time, and another for tracking your customer’s phone number changes over time. In some cases, you may not have a historical record in one table, or the other, leaving an unknown name or phone number captured for a given customer. When re-building history you have to take care of these new periods of time where something hasn’t happened yet.

Now imagine that a customer changed her name three times and only on that third time did she also provide a phone number. That means, our third historical entry of her name change could potentially turn into five records, depending on when the phone number insert/update occurred.

Continue reading →

Every Other Week Flag for Date Dimension or Calendar Table

Data set of Every Other Week Flag using a Date Dimension
Data set of Every Other Week Flag using a Date Dimension

Regardless of your RDBMS you’ll find more than a handful of scripts online to add a Date Dimension or Calendar Table to your schema. One problem I’ve seen is they lack an every-other-week column. So, let’s explore a common SQL Server script and how to add an Every Other Week flag to our date dimension.

This method is written for Microsoft’s SQL Server, but it is ANSI standard and will work with any date dimension which has an integer day-of-week column, which I’ve never seen one that doesn’t…

Continue reading →

Dynamic Querying Using Block Quotes

Dynamic SQL, a photo of a single train track splitting into many tracks.
Dynamic SQL, a photo of a single train track splitting into many tracks.

In ETL, we often have to load many targets from a common set of base tables. Inevitably the targets are different enough that we have to create multiple queries or views to populate the many outputs of data. Which is fine, except now you’ve got yourself a maintenance nightmare, one that is avoidable. I’d like to share a trick with you to take a single query that can be recursively modified to dynamically change its structure to get different outputs. I call it, dynamic querying using block quotes.

The business case:

The sales department would like to take a single report that already exists and split it into two reports. The additional report will require different fitlers, aggregates, columns, and even joins!

Continue reading →

SQL Foo | Method to Find Data Patterns

When dealing with transactional data often there are many levels of granularity lying within. Finding these granularities exposes how your data is shaped as it accumulates and helps paint a better picture of what I like to call Lifes within the data. In this post, I want to share a technique I use to find data patterns which will be beneficial for everyone from the analyst to the architect.

Why do I refer to these data patterns as Lifes?

I haven’t found anything transactional in nature that doesn’t have some sort of recurring theme, with a distinct beginning and end, that couldn’t tell a story. It is these finite beginning/ends, start/stops, on/offs that paint the picture that is the “life” of the data. The life of these stories often have many sub-narratives and are interwoven within a single holistic life of the data. A great example is the familiar case of a customer purchase history. The customer is the holistic life of the data, their purchase orders, individual line items, and even a particular line item purchased over several purchase orders are all examples of sub-narratives within the story of a single customer.

Continue reading →

SQL Foo | Creative Use of Null Values

One of my favorite uses of set-data manipulation involves using NULL values to my advantage; from NULLIF to COALESCE, we’ll explore some creative use of null values. These tips & tricks aren’t just a way to convert a NULL to another value, they’re a multi-purpose, insanely powerful way to massage and combine data.

NULL Galaxy - Creative use of null values
NULL Galaxy – Creative use of null values

Level Set: What is a NULL?

In SQL a NULL value isn’t a value at all – it’s lack of value. It’s a value that is indicative of not having a value. Think of it as if you asked someone a question but they didn’t respond, their response was a NULL value. So, therefore you cannot compare someone’s non-response to someone else’s non-response, while they seem like the same answer they’re likely not even the same question!

There are some exceptions to this rule (thanks, Microsoft) but, like most of my blog posts, I try to stick with ANSI standard rules. While some RDBMSs treat NULL differently and even have switches that can be set during runtime to alter how NULL logic works, we’re not going to go there. Just assume ANSI 99.

Continue reading →