Friday, April 13, 2012

Have a primary key, for Christ's sake!

This posting is a venting of my frustration.

Never ever agree to have a table that doesn't have a primary key. There may be forces of nature that might attempt to convince you to have one, like:

  • Your peers might say you don't need to perform an UPDATE in that table, now or ever
  • You just need that table only for audit or reporting purposes
  • We store data in that table just to go back and refer later which may never happen, so don't bother
  • You may be under a gun where the big boss says "you don't need a primary key, because I say so!"
For Christ's sake, don't yield to any of these things. I was in a limbo recently where I had to make use of one of the legacy tables that didn't have a primary key. This table was once thought as insignificant, but now has suddenly become an important table. The issue is that, for me to perform the operations that I want to perform, I need a primary key.

The issue is that this monster table has more than 40 million rows. Eventually I needed to create a primary key (ID) and back fill all the rows using a sequence. Had to add a trigger to make sure the ID column is populated every time when the application INSERTs a row into that table in future. It was unnecessary hassle.

So, my friend, here are my two cents:
  • Always make sure your table has a primary key. It is very cheap to have one right from the beginning, and populate it using AUTO_INCREMENT (MySQL) or triggers (Oracle), rather than attempting to fix later.
  • If you will perform UPDATEs in a table, make sure you have VERSION column. This is important since it will help you in performing optimistic locking. And if you have more than one instance of application running, it definitely helps. I have seen people using LAST_UPDATE_TIME for optimistic locking. Though that works, it is error prone.
Versioning is an important aspect, which gets easily overlooked. Today you may have only one Tomcat (or any application server) running. Say your product becomes a hit and you need to double the number of application servers. Versioning helps you to perform optimistic locking, and ensure you don't screw up the integrity of data.

No comments: