deft flux

A portal into the creative workings of David Meyer

SQL is a bad bad language

I knew this day would come.  The day I forget the "where" clause of an SQL delete or update statement and delete or update every row in a table.  Here's the code I used:

update tblNCM set CLOSED = 0, CLOSED_BY = null, CLOSED_DATE = null

Imagine my dismay when SQL Server Management Studio reported that this statement affected 970 rows!!  Here's what I meant to put:

update tblNCM set CLOSED = 0, CLOSED_BY = null, CLOSED_DATE = null where NCM_NO = 980

Yes, this was only supposed to affect one row.  So I inadvertently re-opened all of our NCMs and deleted all the closed by and closed date values.  Great.  All I have to say now is thank God for backups!  In this case, it might have cost me my job!  But, within 20 minutes I was able to restore yesterday's backup to a temporary database, copy all of yesterday's values, and check the change log for any NCMs that were closed since yesterday.  Whew!!  In any case, though, the real issue here is that it is too easy to affect every row in SQL.  For select statements, that's ok.  It doesn't cause any damage to select every row.  But had I designed SQL, I would have made it so you would have to explicitly indicate that every row is intended with update and delete statements instead of assuming it when the where clause is missing.  Something like:

update all MyTable set MyField = 'Value'

Or, even better from a programmer's mindset, make the where clause required:

update MyTable set MyField = 'Value' where true

Ok, ok, so "true" isn't exactly an SQL keyword, but it looks so much better than "where 1"...  In any case, I've never liked the SQL language as a language.  This is just one reason why.  It does not communicate intent very well.  For instance, look at my first code snippet above.  How well does that communicate that you intend to update every row?  From the way it looks to me, it seems like CLOSED, CLOSED_BY, and CLOSED_DATE would be scalar variables.  In any other programming language, when you put "set CLOSED = 0", CLOSED is a single variable with a single value.  There is no loop indicated in the language that would iterate every row and update multiple variables.  But when you say "where" afterward, then it implies that changes are made in possibly more than one place.  Omitting the "where" clause omits the part that implies that multiple rows are affected; therefore, the statement doesn't make sense.

"But I'm a lazy programmer," you might say, "and I don't want to have to write 'where true' every time I want to update or delete every row."  What, though, really, is more important?  Saving a few keystrokes?  Or avoiding a situation where a small oversight on your part could cost you your job during a time of economic hardship?  Or, from a less selfish standpoint, avoiding data loss, lost work, potential bugs, etc.?  Fortunately, we do daily backups.  But what if this was a database that was updated thousands of times a day?  Everyone's work on a particular table that day could potentially be lost because of a minor mistake.  Ugh.

Be the first to rate this post

  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5

Categories: Development
Posted by deftflux on Thursday, January 08, 2009 8:55 AM
Permalink | Comments (0) | Post RSSRSS comment feed