Home » T-SQL Tuesday #194 – learning from mistakes

T-SQL Tuesday #194 – learning from mistakes

by Vlad Drumea
0 comments 5 minutes read

This is my first time participating in T-SQL Tuesday, and Louis Davidson‘s invitation about learning from mistakes seemed like the perfect opportunity.

t-sql Tuesday logo

Backstory

To no one’s surprise, I wasn’t always a DBA. 14 years ago, I started out as customer support for a local ISV.

The company I worked for had a customer that, at the time, was one of Romania’s biggest tourism agencies, and I was assigned as their main point of contact after I’d previously helped resolve a few issues.

At some point I was tasked with work that became gradually more SQL Server focused.
I started reading up on how to manage and troubleshoot various SQL Server issues.
This is how I first discovered Brent Ozar and Pinal Dave, who during that time have saved my bacon more times than I could count.

Since I had full access to that customer’s environment, I’ve also started setting up SQL Server backups and addressing configuration issues that were impacting the database used by my employer’s tourism solution.

The mistake

If I recall correctly, this was after a new release of a sales-force automation software that was used by food and beverage vendors across the country, so it was one of those busy days with somewhat displeased folks calling and emailing.

While juggling phone calls with other customers, I’ve received a request from the tourism agency to set the birth date of a 2 specific customers to NULL.
These were “system customers” and weren’t actual people.

At this point you’re probably seeing where this is going.

In a hurry to get that sorted out, I’ve managed to do the “oops… update” that pretty much anyone working with databases has done at one point or another.

Obviously, I’ve realized what I’ve done as soon as I saw the “[huge number] rows affected” message.

Panic! (no, not at the disco)

At this point the gravity of the situation was setting in. The fact that all of my colleagues were engaged in calls with other customers just added to the feeling of anxiety.

I also didn’t want to inform anyone of what I’ve done because I was certain I would have, at best, been barred from working with anything SQL Server related for the foreseeable future.

I knew there’s no way to directly “undo” updates that aren’t in an active explicit transaction.
So, instead of googling “how to undo update SQL Server”, I just stared blankly at the screen for what felt like an eternity.

A moment of clarity

After what was most likely a minute of staring at the screen, I’ve realized that I had a valid recent backup to work with.

And, more importantly, the source database was fairly small (~12GB at the time), which meant a quick restore time.

A good first step

The backup wasn’t from right before the update, but, at worst, using it would have only left me with ~2 hours worth of potential new customers with NULLed birth dates.

I restore the backup as a separate database on the same instance, and I do a cross-database update.

This fixed the bulk of the records in the Customers table, but there were still those potential customers that weren’t captured in that backup.

Taking care of the stragglers

During a quick check I’ve noticed that there were around 30 customers that still had their birth dates marked as NULL even though I was certain they had originally filled them.

In Romania we have something called Cod Numeric Personal, or CNP for short. This stands for Personal Numerical Code and is our equivalent of the American SSN.

The pros of a CNP in this case:

  • Every person/customer has one.
  • Tourism agencies generally require customers to provide their CNPs, especially for traveling outside the country.
  • The CNP contains the birth date of the person it’s assigned to.

Here’s where the pretty much non-existing data protection regulations of those times come in handy.
With modern data protection regulations, no company would store CNPs in clear text, but this was 2013 and data protection was a foreign concept for pretty much everyone.

In a couple of minutes I was able to write an UPDATE statement that set the remaining customers’ birth dates to the dates extracted from their CNPs.

The result

I was able to fix my mess in under 10 minutes and nobody seemed to have noticed.

After everything was fixed, I’ve managed to update those two rows while being very careful to first write a SELECT with a WHERE clause that matched the records to make sure those are the only two targeted.
After confirming, I’ve wrapped the SELECT in a transaction with the COMMIT a few lines down, and then I replaced the SELECT with the relevant UPDATE statement.

Looking back

This was probably the best context for such a mistake.
It was in a production environment, so there were actual stakes at play. But the environment wasn’t so large or busy that it would make fixing it a lengthy task.
And, more importantly, I had a backup that was valid and that I could restore.

After this, I’ve developed a healthy level of paranoia when it came to touching production environments.

I’ve learned to:

  • Always double‑check before changing something.
  • Make sure there are viable backups available.
  • Create backup copies of tables before major changes so I won’t have to restore an entire backup for a single table.
  • Script out my changes beforehand.
    This makes it easier to spot any potential mistakes early in the process as well as track what was done.
  • Prepare scripts to undo/fix any of the changes in case something goes wrong.
    This way I wouldn’t have to scramble under pressure and hack something together to revert my changes.

Conclusion

This was my first, and hopefully not last, contribution to T-SQL Tuesday.
I hope you’ve learned something useful from my mistake.

You may also like

Leave a Comment

* By using this form you agree with the storage and handling of your data by this website.

This site uses Akismet to reduce spam. Learn how your comment data is processed.