Posts Tagged ‘My worlds collide’

Eff Yeah-DBA!

Wednesday, October 2nd, 2013

In my day job, I’m a SQL Server DBA which means databases and data. At the enterprise. I know, most people find this mind-numbingly boring.

You all remember the Goodreads debacle? Where GR decided, without warning, to delete content they’d deemed inappropriate? They gave no notice to the user whose content it was, they just deleted it.

When I first heard the news, the DBA in me said, “I hope they just have a flag to set rather than an actual deletion.” And also “I hope they took a backup first.”

I actually felt kind of sick, because the data deletion in this case was so obviously the wrong way to go about this that recoverability should have been step 1. One of the primary rules of databases is you don’t delete content without being damned sure it MUST be deleted and that you can get it back if it turns out you’re wrong. Every DBA alive has done a data delete based on a business user’s insistence that YES! They have given you the correct information! And you say, OK, and five minutes later they say, “oh, wait.”

That’s why a good data architect, which happens to be a lot of my job right now, will often build in bit fields that represent something like Active/Inactive. That way, you can disable content without deleting it. Depends on the need, of course.
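Here's a minimal sketch of the Active/Inactive idea. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the `reviews` table and `is_active` column are names I made up for illustration, not anything Goodreads actually has.

```python
import sqlite3

def make_demo_db():
    """Build an in-memory table with a hypothetical is_active flag column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE reviews ("
        " id INTEGER PRIMARY KEY,"
        " body TEXT NOT NULL,"
        " is_active INTEGER NOT NULL DEFAULT 1)"  # 1 = Active, 0 = Inactive
    )
    conn.executemany(
        "INSERT INTO reviews (id, body) VALUES (?, ?)",
        [(1, "Loved it"), (2, "Flagged content"), (3, "Not for me")],
    )
    conn.commit()
    return conn

def deactivate(conn, review_id):
    """Hide a row by flipping the flag instead of deleting it."""
    conn.execute("UPDATE reviews SET is_active = 0 WHERE id = ?", (review_id,))
    conn.commit()

def reactivate(conn, review_id):
    """Undo a deactivation; the row was never gone."""
    conn.execute("UPDATE reviews SET is_active = 1 WHERE id = ?", (review_id,))
    conn.commit()

def visible_ids(conn):
    """Return the ids the application would show (active rows only)."""
    rows = conn.execute("SELECT id FROM reviews WHERE is_active = 1 ORDER BY id")
    return [row[0] for row in rows]
```

The point is that the "oh, wait" five minutes later becomes a one-line UPDATE instead of a restore.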

Data deletion can cause serious problems. Data integrity problems. History problems — like when someone asks, what did the data look like a week ago? That’s a question that often needs to be answered.

So, I’d been thinking all along that no sane DBA would actually do the deletion without making a backup first, if not of the database, at least of the affected tables, or, even, just the deleted data. SELECT * INTO [SOME TABLE] FROM [SOURCE TABLE] WHERE [CRITERIA]. (FYI, Goodreads is a SQL Server shop.)
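As a rough sketch of "copy the rows aside, then delete," here is the same pattern in Python with sqlite3 standing in for SQL Server. SQLite has no SELECT ... INTO, so the copy is done with CREATE TABLE ... AS SELECT plus INSERT ... SELECT. The table names `reviews` and `reviews_backup` are hypothetical.

```python
import sqlite3

def delete_with_backup(conn, flagged_ids):
    """Copy the doomed rows into a side table, then delete them.

    Returns the number of rows deleted from the hypothetical reviews table.
    """
    # Create an empty side table with the same columns, if it isn't there yet.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS reviews_backup AS "
        "SELECT * FROM reviews WHERE 0"
    )
    marks = ",".join("?" for _ in flagged_ids)
    with conn:  # commits both statements together, or rolls both back on error
        conn.execute(
            f"INSERT INTO reviews_backup SELECT * FROM reviews WHERE id IN ({marks})",
            flagged_ids,
        )
        cur = conn.execute(
            f"DELETE FROM reviews WHERE id IN ({marks})", flagged_ids
        )
    return cur.rowcount

def demo():
    """Set up three rows, delete two, and show what survives where."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE reviews (id INTEGER PRIMARY KEY, body TEXT)")
    conn.executemany(
        "INSERT INTO reviews VALUES (?, ?)",
        [(1, "keep me"), (2, "flagged"), (3, "also flagged")],
    )
    conn.commit()
    deleted = delete_with_backup(conn, [2, 3])
    kept = [r[0] for r in conn.execute("SELECT id FROM reviews ORDER BY id")]
    saved = [r[0] for r in conn.execute("SELECT id FROM reviews_backup ORDER BY id")]
    return deleted, kept, saved
```

If the business user comes back with "oh, wait," the deleted rows are sitting in the side table.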

I have been in production emergencies where people are deciding things very quickly, and boy, all I can say is most DBAs I know would have done some kind of backup first. But maybe the boss was standing over the DBA’s shoulder saying, do it. Now. Or you’re fired.

That said, I find it interesting that GR is saying they are going to backups to get the data. Transactional databases are backed up on a schedule such that you can restore to a point in time. All GR would need is the backup sets for the day before and then, to do it right, the transaction logs up to the time right before the deletion. That’s the only way to guarantee you’d get all the data as it existed prior to deletion. We’re now 12 days out so they’ve probably had to go to offsite storage to get the old backups, though maybe they have that much on hand.
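To make the "backup set plus transaction logs" idea concrete, here is a toy model in Python of picking which backups you'd restore, and in what order, to reach a moment just before the deletion. Everything in it (the `Backup` class, `restore_chain`, the backup names) is made up for illustration. It ignores differential backups and log sequence numbers, which a real SQL Server restore has to respect.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class Backup:
    kind: str           # "full" or "log"
    finished: datetime  # when the backup completed
    name: str

def restore_chain(backups, stop_at):
    """Return the backups to restore, in order, to reach the moment stop_at."""
    fulls = [b for b in backups if b.kind == "full" and b.finished <= stop_at]
    if not fulls:
        raise ValueError("no full backup was taken before the target time")
    base = max(fulls, key=lambda b: b.finished)  # most recent full backup

    # Log backups taken after the base, oldest first.
    logs = sorted(
        (b for b in backups if b.kind == "log" and b.finished > base.finished),
        key=lambda b: b.finished,
    )
    chain = [base]
    for log in logs:
        chain.append(log)
        if log.finished >= stop_at:
            break  # this log covers stop_at; replay would stop inside it
    else:
        raise ValueError("log backups do not reach the target time")
    return chain
```

The last log in the chain is the one you'd replay only partway, stopping right before the delete ran.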

Or else the DBA has quietly said, “Here’s the data I backed up before we did this.”

Fuck Yeah, DBA.

Please remember to thank the DBA.

Edited to add: I should have mentioned that here, they’d just restore to another server, or another instance of the DB, because they’re not restoring the deletions into prod, they’re giving the user the data that was deleted. The two letters I’ve seen both say, “you still can’t post or re-post data that is in violation of the TOS,” which I took to mean, “if you want to re-post, do so from the data we’re sending you.”

There are a bunch of ways to restore objects or deleted rows into current production, but it wouldn’t be easy and you’d probably need a 3rd party tool. Which is why, if I’m right, they’re sending the data to the user and letting them decide whether to re-post.



Why eBook Formatting Matters – A Case Study

Monday, May 28th, 2012

This is an image heavy post. I used my iPad3 to illustrate the problems.

Typesetting is a visual art. And, ironically, the best typeset books are ones where the art gets out of the way.

Berkley Books published my 2009 award winning historical Indiscreet. They did a fantastic job on the cover (Paul Marron!!!) and I love my editor. The entire Berkley team did a great job on editing, copy editing and proofreading. Berkley also eventually did an eBook. Since Berkley only has North American rights, I just recently issued an eBook outside the US where I have rights. And that means I have a case study about why eBook formatting matters and where the Berkley eBook falls very, very short. I heard from readers months ago that the eBook was practically unreadable. A while back I purchased the Kindle eBook and discovered it was true.

Some of you may know that I used to be a web developer in a previous job. That means I am very good with html and css and, compared to the average author, quite knowledgeable about the guts of eBook technology. Am I bang up good at it? Not anymore, though I probably will be again shortly as there are lots of good reasons for me to brush up on my skills with respect to eBook formatting.

Here’s a screen cap of page 1 of Chapter 26 of the Berkley eBook for Indiscreet:


This looks OK, at first. But there are two HUGE typographic issues, one of which is better illustrated in the next image. The typographic errors are caused by the underlying html and css by the way, so to be accurate it’s not so much typography errors as coding errors that result in a poor reading experience. But typography gives us language that describes the problems.

In Indiscreet, I made the choice of opening many chapters with text that is descriptive of the chapter contents. It’s meta-text, in a way. Above, you can tell that first paragraph is indented text. You can tell that because there is text below that is not indented.

But this is an iPad screen that’s bigger than a lot of others. You can see more of the text per screen page. But what if you were reading this on an iPhone or other smaller screen? What visual clues exist to tell you that this is meta-text and not the actual writing? The indents are kind of a clue, but the white space issues make those indents less effective as a signifier. Notice the large spaces between paragraphs.

On the iPhone, the problem is even worse. You can only see a portion of any given paragraph.

So, the bad html/css creates too much white space and that lessens the visual clue of indented text. Pages need white space. Too little white space is as serious a problem as too much.

Look at this image and you’ll get a better idea of the problem:

Landscape mode shows what happens with smaller screen real estate. On the left column, the indenting more or less vanishes. There’s no other text to show that this is an indented passage. What if your reading screen consisted ONLY of what you see in the left column?

The meta-text becomes indistinguishable from the actual text. The fact that the actual chapter text begins with CAPS isn’t a sufficient signifier, because, again, it signifies only when you can compare it to something else. A good typographical solution would embed the signification in the text itself. A different font entirely, for example.

Since this is the digital world where color doesn’t cost the publisher money, we should all be thinking about whether color might ever be of assistance. But that’s a digression. I’m just saying that in eBooks, black and white thinking might be limiting us in arriving at elegant solutions. Not that I don’t also recognize the peril of color in the hands of people who don’t understand color theory.

Anyway, if you are trying to read Indiscreet on a smaller screen, you will spend time dropping in and out of the story as the poor typography makes you struggle to figure out what is meta-text and what is regular text. As another aside, my meta-text authorial decision itself pulls the reader slightly out of the story flow and, boy, the bad presentation only exacerbates that problem.

Here’s what I did in my version:

Typography solved this issue centuries ago.

Notice the italics. It’s obvious that the italicized text is meta-text. In fact, this is a standard reason to use italics. With italics, it doesn’t matter if the reader sees only the meta-text. Italics alone tells the reader that they are not in the actual text.

Here’s an interior page from the Berkley eBook that shows why the formatting errors make their version of Indiscreet a chore to read:

Even on the iPad, with its bigger screen real estate, all that white space between paragraphs destroys the reading flow of the story. As you can see, it’s particularly bad when there are short paragraphs.

Here’s my version:

It makes a difference to the reader. A big difference. Page after page of paragraphs sitting in an ocean of white makes reading a chore.

You’ll notice that I don’t fully justify the text. I’m open to persuasion on this issue, but my current position is that when text flows to fit a changing container (landscape vs. portrait, iPad vs iPhone vs. Kindle Fire etc…) full justification will inevitably and unpredictably lead to lines of text that stretch in ways that slow down the eye and therefore the reading because of the insertion of white space. So, I left justify.

I also don’t include those pretty dingbatty things that start out chapters, even though it would be dead easy to do. Why? Mostly because the iPad background is actually not completely white and the dingbats show up on a whiter rectangular background and that bugs me. You can see the issue in the Berkley Chapter 26 image. That pretty sideways spade shows up on a whiter background. And it STILL bugs me. I’ve been mulling over various solutions to the problem, but I haven’t reached the point where no-dingbat vs. time to create perfectly invisible dingbat background has driven me to set aside time for a solution. I have a few ideas.

So. There you go.

I worked very hard to write a book readers would enjoy. And in the case of this eBook, it’s too much of a chore for many readers. They can’t read the story because of the presentation and that makes me sad.

Carolyn Talks Techie

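Both P and DIV are block level elements, and either can end up with white space above and below it: P gets a default margin in most browsers and reading systems, and a DIV gets whatever margin or padding the style sheet (or the reading system) applies. You must create and apply a style to control that spacing if you don’t want that white space to appear.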

p {
font-size: 100%;
margin: 0em 0em 0em 0em;
text-indent: 1.5em;
text-align: left;
}

Will create paragraphs that don’t include space before or after and that indent the first line

<p>Your fantastic text.</p>

Will look more like a book page than

<DIV>Your fantastic text.</DIV>

Which will actually render really badly.

When each paragraph is a DIV that carries padding, stacking a bunch of them gives you white space below the first div, then white space above the following div. Double the white space. Padding never collapses, unlike the vertical margins between adjacent P elements, where you get only the larger of the two.

Therefore, if your program to convert your Word doc or PDF to html is creating DIV tags instead of P tags to contain your paragraphs of text, it is ignoring the html specification, which defines P as the paragraph element and treats DIV as a generic container with no meaning of its own. If it also doesn’t even create a style sheet to fix the problem, your eBooks will suck.
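If you're stuck with a converter like that, a cleanup pass is not hard. Here's a deliberately naive sketch in Python that swaps DIV tags for P tags. The function name is my own invention, and it assumes every DIV in the file wraps exactly one paragraph of plain text. If your file nests DIVs or uses them for layout, this would produce invalid markup, so treat it as a starting point only.

```python
import re

# Matches opening and closing div tags in any case, keeping any attributes.
_DIV_TAG = re.compile(r"<(/?)div\b([^>]*)>", re.IGNORECASE)

def divs_to_paragraphs(html):
    """Rewrite <div ...> and </div> as <p ...> and </p>.

    Assumes each div holds a single paragraph of inline text.
    """
    return _DIV_TAG.sub(lambda m: f"<{m.group(1)}p{m.group(2)}>", html)
```

Run that over the converter's output, attach a style sheet like the one above, and the paragraphs should sit on the page the way a book's paragraphs do.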

And Now I Call Publishers Out

Why are publishers using tools that create CRAP eBooks?
Why haven’t they hired anyone to fix it?
If they are out-sourcing this work and paying for it from the gross income, they’re getting ripped off. So is the author.

C-Level employees of publishers can be found all over the place talking about the ART of publishing and how much they care about the physical beauty of their product and how much they care about the reading experience.