Use Google Sheets to automate your data pipeline development:
This isn’t the first time I’ve talked about a code generator or SQL Generation on this blog, but it’s worth discussing again because I wanted to talk about a recent project where I upped the ante on not just generating SQL, but generating the DDL & DML to support an entire ETL pipeline — all while self-documenting everything!
Throughout the project I was able to quickly test different indexing strategies without writing a single line of code.
I’m aware that many of my readers of this site likely already know how to write sql to generate sql, or sql generators, but it’s not really much of a data blog if I don’t mention it. It’s also important to mention the time value this trick represents, with a few minutes of time, you can generate thousands of lines of code that would otherwise take hours to write. Let’s chat about one of the first things I learned about SQL which, no doubt, blew my mind:
Generate SQL with SQL.
Or to put another way, generating SQL statements via a single SQL command. Really cool stuff if you haven’t seen this before, keep reading!
Let’s say you wanted to find the number of badges for each type of badge in a table of badges. This is quite easy with a simple query, but let’s see how we can do it by writing SQL with SQL:
Writing SQL with SQL example INPUT.
+'''AS VARCHAR(50)), COUNT(1) FROM BADGES WHERE BADGES.NAME = '''