I recently had a discussion with a colleague about the naming of tables and whether or not a plural table name makes sense. The argument for pluralizing had two interesting, competing viewpoints for which I’d like to discuss and give my reasoning why I prefer the singular form.
“A sock drawer contains socks. You wouldn’t label your sock drawer ‘sock’.”
“It’s not a ‘drawer of socks’, it is a sock drawer.”
Two competing arguments for the plurality of naming conventions.
Firstly, I’d like to point out that “socks” is inherently plural as they come in a pair; I would argue that this alone negates the entire conversation. However, I still think it’s a common struggle when naming tables and objects.
The question is: Do you look at a table from a holistic point of view, the collection as a whole, or do you look at it from the view of the records within the table, which make up the collection?
Recently, I had a situation where there were a multitude of ways a set of events could occur to a particular data point over time – where I needed to know what every possible pattern of events had occurred to that single data point. In this post, I’ll walk through the scenario of when you’d want to do something like this and how find patterns in a dataset.
Imagine you have a source system which allows a customer to interact with your front-end application, like updating their profile, and you want to know how they update their profile and all the different patterns in which they go about interacting with your system. To do that, you’ll need to recursively join to your data and build that pattern, in a set-based way, achieving superior performance with a very large dataset.
CTEs are a powerful feature of modern RDBMSs which allow you to do some very creative things with set-based data. Some systems even allow you to nest them inside of themselves for even more crazy, creative solutions. Let’s discuss when to use a CTE.
The word “common” from the acronym CTE (Common Table Expression) means you want to use a query more than once — because it’s common.
Use Google Sheets to automate your data pipeline development:
This isn’t the first time I’ve talked about a code generator or SQL Generation on this blog, but it’s worth discussing again because I wanted to talk about a recent project where I upped the ante on not just generating SQL, but generating the DDL & DML to support an entire ETL pipeline — all while self-documenting everything!
Throughout the project I was able to quickly test different indexing strategies without writing a single line of code.
It’s almost that wonderful time of year when you’ll wait till the very last minute and blab on about all things you did, some excuses why you didn’t, and hope you didn’t leave out anything important. It’s performance review time so, I wanted to share a couple things that I do which help me be prepared and draft the best possible self-evaluation to maximize my reward.
A common gripe I have when reviewing and maintaining other people’s SQL, aside from the obvious readability due to poor formatting, is their lack of aliases. In this post, I’d like to talk about why you should always reference aliases. The one major reason why, and the one minor reason when not to.
When maintaining and improving upon legacy code it’s easy to redefine standards, or define nonexistent standards, which have cascading effects on the pipeline. Take something as simple as an ill-conceived naming convention which, in hindsight, turned out to not make sense over time. At some point the benefits of rewriting lots of code outweigh simply sticking with the bad convention. In these cases consistency over standards become the convention.
I recently went to a forum hosted by Women In Technology where they invited a large group of young women, some in high school up to some in their first internship out of college, who were there to meet one-on-one with tech professionals to ask questions and interact with some of the industry’s best (not self-promoting; there were many fine, far-more-seasoned-engineers at this event). Think speed-dating but for gathering knowledge, which I’ve dubbed: Women In Tech Speed-Mentoring.
I, of course, promoted data; I promoted the hell out of data!
The event itself was unfortunately quite short, for the mentees anyway, for which we all only spoke one-on-one with three aspiring tech geeks (plus a few chats while waiting in the pizza line). Some were declared computer science majors while others were still unsure of a direction to go in this broad world of tech. I, of course, promoted data; I promoted the hell out of data! However, in doing so, I was taken back by a couple things and pleasantly surprised at others — all of which gave me a chance to really connect with them, I hope. These surprises made me want to write this post…