I recently had a discussion with a colleague about the naming of tables and whether or not a plural table name makes sense. The argument for pluralizing had two interesting, competing viewpoints for which I’d like to discuss and give my reasoning why I prefer the singular form.
“A sock drawer contains socks. You wouldn’t label your sock drawer ‘sock’.”
“It’s not a ‘drawer of socks’, it is a sock drawer.”
Two competing arguments for the plurality of naming conventions.
Firstly, I’d like to point out that “socks” is inherently plural as they come in a pair; I would argue that this alone negates the entire conversation. However, I still think it’s a common struggle when naming tables and objects.
The question is: Do you look at a table from a holistic point of view, the collection as a whole, or do you look at it from the view of the records within the table, which make up the collection?
Recently, I had a situation where there were a multitude of ways a set of events could occur to a particular data point over time – where I needed to know what every possible pattern of events had occurred to that single data point. In this post, I’ll walk through the scenario of when you’d want to do something like this and how find patterns in a dataset.
Imagine you have a source system which allows a customer to interact with your front-end application, like updating their profile, and you want to know how they update their profile and all the different patterns in which they go about interacting with your system. To do that, you’ll need to recursively join to your data and build that pattern, in a set-based way, achieving superior performance with a very large dataset.
A common gripe I have when reviewing and maintaining other people’s SQL, aside from the obvious readability due to poor formatting, is their lack of aliases. In this post, I’d like to talk about why you should always reference aliases. The one major reason why, and the one minor reason when not to.
Referential Integrity in the Data Warehouse is a controversial topic amongst BI Professionals; you’re either all for it or all against it. While keeping the focus on RI, solely with regard to the Data Warehouse, I want to discuss some pros and cons to help you make a better, more informed decision that will have long-lasting impacts to your warehouse, development lifecycle, and production maintenance.
You need to understand that referential integrity is a decision, not a standard. Blindly implementing RI because you’re “supposed to” is a presumptuous, neglectful response to a complex problem. I think that everyone will agree that leaving constraints at the application layer is an apprentice mistake, nevertheless, when it comes to data it may be our best choice. Yet, still, having table-driven constraints is always preferred — if it makes sense!
When does it make sense to have Referential Integrity?