Eff Yeah-DBA!

Wednesday, October 2nd, 2013

In my day job, I’m a SQL Server DBA, which means databases and data at the enterprise level. I know, most people find this mind-numbingly boring.

You all remember the Goodreads debacle? Where GR decided, without warning, to delete content they’d deemed inappropriate? They gave no notice to the user whose content it was, they just deleted it.

When I first heard the news, the DBA in me said, “I hope they just have a flag to set rather than an actual deletion.” And also, “I hope they took a backup first.”

I actually felt kind of sick, because the data deletion in this case was so obviously the wrong way to go about this that recoverability should have been step 1. One of the primary rules of databases is you don’t delete content without being damned sure it MUST be deleted and that you can get it back if it turns out you’re wrong. Every DBA alive has done a data delete based on a business user’s insistence that YES! They have given you the correct information! And you say, OK, and five minutes later they say, “oh, wait.”

That’s why a good data architect, which happens to be a lot of my job right now, will often build in bit fields that represent something like Active/Inactive. That way, you can disable content without deleting it. Depends on the need, of course.
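Here is a rough sketch of that Active/Inactive idea. It uses Python's built-in sqlite3 module so anyone can run it; in SQL Server the flag would typically be a BIT column. The `reviews` table, its columns, and the helper functions are all made up for illustration and are not anything Goodreads actually has.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE reviews (
        review_id INTEGER PRIMARY KEY,
        user_name TEXT NOT NULL,
        body      TEXT NOT NULL,
        is_active INTEGER NOT NULL DEFAULT 1  -- stand-in for a SQL Server BIT column
    )
    """
)
conn.executemany(
    "INSERT INTO reviews (user_name, body) VALUES (?, ?)",
    [("alice", "Loved it."), ("bob", "Off-topic rant."), ("carol", "Three stars.")],
)


def deactivate_review(conn, review_id):
    """Hide a review by flipping its flag; the row itself stays in the table."""
    conn.execute("UPDATE reviews SET is_active = 0 WHERE review_id = ?", (review_id,))


def reactivate_review(conn, review_id):
    """Undo a deactivation. This is the 'oh, wait' path."""
    conn.execute("UPDATE reviews SET is_active = 1 WHERE review_id = ?", (review_id,))


def visible_reviews(conn):
    """What the site would show: active rows only."""
    return conn.execute(
        "SELECT review_id, user_name FROM reviews WHERE is_active = 1 ORDER BY review_id"
    ).fetchall()


deactivate_review(conn, 2)
visible = visible_reviews(conn)
total_rows = conn.execute("SELECT COUNT(*) FROM reviews").fetchone()[0]

reactivate_review(conn, 2)
visible_after_undo = visible_reviews(conn)
```

The cost is that every query that shows content has to remember the `is_active = 1` filter, which is part of why it depends on the need.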

Data deletion can cause serious problems. Data integrity problems. History problems — like when someone asks, what did the data look like a week ago? That’s a question that often needs to be answered.

So, I’d been thinking all along that no sane DBA would actually do the deletion without making a backup first, if not of the database, at least of the affected tables, or even just the rows about to be deleted: SELECT * INTO [SOME TABLE] FROM [SOURCE TABLE] WHERE [CRITERIA]. (FYI, Goodreads is a SQL Server shop.)
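To make the "copy the rows first, then delete" habit concrete, here is a small runnable sketch. It uses Python's sqlite3 module, where CREATE TABLE ... AS SELECT plays the role that SELECT ... INTO plays in SQL Server. The table names, the sample data, and the `delete_with_backup` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reviews (review_id INTEGER PRIMARY KEY, user_name TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?, ?)",
    [(1, "alice", "Loved it."), (2, "bob", "Off-topic rant."), (3, "bob", "Another rant.")],
)
conn.commit()


def delete_with_backup(conn, backup_table, where_clause, params):
    """Copy the doomed rows into backup_table, then delete them.

    backup_table and where_clause are pasted straight into the SQL text,
    so they must come from the DBA's own script, never from user input.
    """
    conn.execute(
        f"CREATE TABLE {backup_table} AS SELECT * FROM reviews WHERE {where_clause}",
        params,
    )
    saved = conn.execute(f"SELECT COUNT(*) FROM {backup_table}").fetchone()[0]
    deleted = conn.execute(f"DELETE FROM reviews WHERE {where_clause}", params).rowcount
    if saved != deleted:
        conn.rollback()  # undo the delete; the backup table stays where it is
        raise RuntimeError(f"backed up {saved} rows but deleted {deleted}")
    conn.commit()
    return deleted


deleted_count = delete_with_backup(conn, "reviews_deleted_backup", "user_name = ?", ("bob",))
remaining = conn.execute("SELECT COUNT(*) FROM reviews").fetchone()[0]

# Five minutes later, "oh, wait": put the rows back from the backup table.
conn.execute("INSERT INTO reviews SELECT * FROM reviews_deleted_backup")
conn.commit()
restored_total = conn.execute("SELECT COUNT(*) FROM reviews").fetchone()[0]
```

The backup table costs almost nothing to create and can be dropped once everyone is sure the delete was right.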

I have been in production emergencies where people are deciding things very quickly, and boy, all I can say is most DBAs I know would have done some kind of backup first. But maybe the boss was standing over the DBA’s shoulder saying, do it. Now. Or you’re fired.

That said, I find it interesting that GR is saying they are going to backups to get the data. Transactional databases are backed up on a schedule such that you can restore to a point in time. All GR would need is the backup sets for the day before and then, to do it right, the transaction log backups up to the time right before the deletion. That’s the only way to guarantee you’d get all the data as it existed prior to deletion. We’re now 12 days out, so they’ve probably had to go to offsite storage to get the old backups, though maybe they have that much on hand.
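If the "backup plus logs" idea is fuzzy, here is a toy model of it in Python. This is not how SQL Server works internally; in T-SQL the real thing is a RESTORE DATABASE followed by RESTORE LOG ... WITH STOPAT. The snapshot, the log entries, and the timestamps below are all invented, and the sketch only shows why the nightly backup alone is not enough.

```python
from datetime import datetime

# Snapshot of the table as of the nightly full backup (review_id -> body).
full_backup = {1: "Loved it.", 2: "Off-topic rant."}

# Changes recorded after that backup, oldest first. Timestamps are made up.
transaction_log = [
    (datetime(2000, 1, 2, 9, 0), "insert", 3, "Three stars."),
    (datetime(2000, 1, 2, 10, 30), "update", 1, "Loved it. Twice."),
    (datetime(2000, 1, 2, 14, 0), "delete", 2, None),  # the deletion we regret
    (datetime(2000, 1, 2, 15, 0), "insert", 4, "Posted after the purge."),
]


def restore_to_point_in_time(full_backup, log, stop_at):
    """Start from the full backup and replay log entries strictly before stop_at."""
    restored = dict(full_backup)  # never modify the backup itself
    for timestamp, operation, review_id, body in sorted(log, key=lambda entry: entry[0]):
        if timestamp >= stop_at:
            break
        if operation == "delete":
            restored.pop(review_id, None)
        else:  # insert and update both set the current value
            restored[review_id] = body
    return restored


# Stop right at the moment of the deletion, so the deletion itself is not replayed.
just_before_deletion = restore_to_point_in_time(
    full_backup, transaction_log, datetime(2000, 1, 2, 14, 0)
)
```

Restoring only the full backup would still have review 2, but it would miss review 3 and the edit to review 1. Replaying the log up to the stop time is what gets you the data exactly as it stood before the delete.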

Or else the DBA has quietly said, “Here’s the data I backed up before we did this.”

Fuck Yeah, DBA.

Please remember to thank the DBA.

Edited to add: I should have mentioned that here, they’d just restore to another server, or another instance of the DB, because they’re not restoring the deletions into prod; they’re giving the user the data that was deleted. The two letters I’ve seen both say, “you still can’t post or re-post data that is in violation of the TOS,” which I took to mean: if you want to re-post, do so from the data we’re sending you.

There are a bunch of ways to restore objects or deleted rows into current production, but it wouldn’t be easy and you’d probably need a third-party tool. Which is why, if I’m right, they’re sending the data to the user and letting them decide whether to re-post.
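For the curious, pulling one user's deleted rows out of a side-by-side restored copy might look something like the sketch below. It is an illustration only, using Python's sqlite3 with a second in-memory database attached as `restored` to play the part of the other server or instance. The schema, the data, and the `deleted_rows_for_user` helper are hypothetical, and I have no idea what GR's actual process looks like.

```python
import sqlite3

# "main" plays current production; "restored" plays the copy restored elsewhere.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS restored")

schema = "(review_id INTEGER PRIMARY KEY, user_name TEXT, body TEXT)"
conn.execute(f"CREATE TABLE main.reviews {schema}")
conn.execute(f"CREATE TABLE restored.reviews {schema}")

rows_before_deletion = [
    (1, "alice", "Loved it."),
    (2, "bob", "Off-topic rant."),
    (3, "bob", "Another rant."),
]
conn.executemany("INSERT INTO restored.reviews VALUES (?, ?, ?)", rows_before_deletion)
# Production only has what survived the purge.
conn.executemany("INSERT INTO main.reviews VALUES (?, ?, ?)", rows_before_deletion[:1])


def deleted_rows_for_user(conn, user_name):
    """Rows in the restored copy that no longer exist in production, for one user."""
    return conn.execute(
        """
        SELECT r.review_id, r.body
        FROM restored.reviews AS r
        LEFT JOIN main.reviews AS p ON p.review_id = r.review_id
        WHERE p.review_id IS NULL AND r.user_name = ?
        ORDER BY r.review_id
        """,
        (user_name,),
    ).fetchall()


export_for_bob = deleted_rows_for_user(conn, "bob")
prod_row_count = conn.execute("SELECT COUNT(*) FROM main.reviews").fetchone()[0]
```

Production is only ever read here, never written, which is the whole point of handing the data back to the user instead of re-inserting it.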