Recently, I had a situation where there were a multitude of ways a set of events could occur to a particular data point over time – where I needed to know what every possible pattern of events had occurred to that single data point. In this post, I’ll walk through the scenario of when you’d want to do something like this and how find patterns in a dataset.
Imagine you have a source system which allows a customer to interact with your front-end application, like updating their profile, and you want to know how they update their profile and all the different patterns in which they go about interacting with your system. To do that, you’ll need to recursively join to your data and build that pattern, in a set-based way, achieving superior performance with a very large dataset.
CTEs are a powerful feature of modern RDBMSs which allow you to do some very creative things with set-based data. Some systems even allow you to nest them inside of themselves for even more crazy, creative solutions. Let’s discuss when to use a CTE.
The word “common” from the acronym CTE (Common Table Expression) means you want to use a query more than once — because it’s common.
Use Google Sheets to automate your data pipeline development:
This isn’t the first time I’ve talked about a code generator or SQL Generation on this blog, but it’s worth discussing again because I wanted to talk about a recent project where I upped the ante on not just generating SQL, but generating the DDL & DML to support an entire ETL pipeline — all while self-documenting everything!
Throughout the project I was able to quickly test different indexing strategies without writing a single line of code.
It’s almost that wonderful time of year when you’ll wait till the very last minute and blab on about all things you did, some excuses why you didn’t, and hope you didn’t leave out anything important. It’s performance review time so, I wanted to share a couple things that I do which help me be prepared and draft the best possible self-evaluation to maximize my reward.
A common gripe I have when reviewing and maintaining other people’s SQL, aside from the obvious readability due to poor formatting, is their lack of aliases. In this post, I’d like to talk about why you should always reference aliases. The one major reason why, and the one minor reason when not to.
When maintaining and improving upon legacy code it’s easy to redefine standards, or define nonexistent standards, which have cascading effects on the pipeline. Take something as simple as an ill-conceived naming convention which, in hindsight, turned out to not make sense over time. At some point the benefits of rewriting lots of code outweigh simply sticking with the bad convention. In these cases consistency over standards become the convention.
I recently went to a forum hosted by Women In Technology where they invited a large group of young women, some in high school up to some in their first internship out of college, who were there to meet one-on-one with tech professionals to ask questions and interact with some of the industry’s best (not self-promoting; there were many fine, far-more-seasoned-engineers at this event). Think speed-dating but for gathering knowledge, which I’ve dubbed: Women In Tech Speed-Mentoring.
I, of course, promoted data; I promoted the hell out of data!
The event itself was unfortunately quite short, for the mentees anyway, for which we all only spoke one-on-one with three aspiring tech geeks (plus a few chats while waiting in the pizza line). Some were declared computer science majors while others were still unsure of a direction to go in this broad world of tech. I, of course, promoted data; I promoted the hell out of data! However, in doing so, I was taken back by a couple things and pleasantly surprised at others — all of which gave me a chance to really connect with them, I hope. These surprises made me want to write this post…
After a lot of swearing I decided to make this post, dynamic column names in SSRS 2005, to help the internet and give back to the huddled masses who still have to support this old reporting services instance from 2005. In this version, as all of my dead-end googling revealed, there is no way to dynamically or ordinal-positionally set the column names in your reports — which is insane! I mean, what year were Microsoft in, in 2005?
Anway, like I said, after a lot of swearing, I came up with a creative way to achieve dynamic column names in SSRS 2005 and this post will share how you can implement the same solution.
Don’t even get me started on dynamic data sources…