Shiny New E-Book Gizmo: The Amazon Kindle


Beautify Baen e-books


IceHand
02-21-2008, 07:46 AM
I have quite a few e-books from Baen and noticed that they use straight quotes instead of curly quotes. So I searched for a script or program that converts the quotes automatically, but couldn't find one I could use. Then I discovered the program "sed" and wrote a quick and dirty one-liner that converts the HTML file from an exploded Baen LIT:
#!/bin/bash
set -e

mv "$1" "$1.backup"
# Pair up quotes, restore straight quotes in attributes, write entities (named entities assumed), then repair the DOCTYPE.
sed -e 's|"\([^"][^"]*\)"|“\1”|g' -e 's|"|“|g' \
    -e 's|=“[^”]*”|="\0"|g' -e 's|=“||g' -e 's|”"|"|g' -e 's|=”[^“]*“|="\0"|g' -e 's|=”||g' -e 's|“"|"|g' \
    -e "s| '| ‘|g" -e "s|'|’|g" \
    -e "s|“|\&ldquo;|g" -e "s|”|\&rdquo;|g" -e "s|‘|\&lsquo;|g" -e "s|’|\&rsquo;|g" \
    -e "s|\. \. \. \.|\&hellip;|g" -e "s|\.\.\.\.|\&hellip;|g" -e "s|\.\ \.\ \.\ \.|\&hellip;|g" -e "s|\. \. \.|\&hellip;|g" -e "s|\.\.\.|\&hellip;|g" -e "s|\.\ \.\ \.|\&hellip;|g" \
    -e 's|\&ldquo;+//|"+//|g' -e 's|//EN\&rdquo;|//EN"|g' -e 's|\&ldquo;http://openebook|"http://openebook|g' -e 's|\.dtd\&rdquo;>|\.dtd">|g' \
    "$1.backup" > "$1"

exit 0
Maybe someone will find this useful. It converts straight quotes to curly quotes and ". . ." and ". . . ." to "…" (&hellip;), and of course it keeps a backup of the original HTML file.
I haven't tried it, but there is a sed for Windows too (http://gnuwin32.sourceforge.net/packages/sed.htm).
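
For anyone who wants to see the core idea in isolation, here is a minimal sketch that handles only double quotes and spaced ellipses in plain text. The function name curl_quotes and the sample sentence are made up for illustration, it assumes sed is on the path, and it makes no attempt to protect quotes inside HTML attributes the way the full script does:

```shell
#!/bin/bash
# curl_quotes is a hypothetical helper: reads text on stdin, writes to stdout.
curl_quotes() {
  # Four-dot ellipses go first, so the three-dot rule cannot eat part of them.
  # Then each "..." pair becomes &ldquo;...&rdquo; (\& is a literal ampersand).
  sed -e 's|\. \. \. \.|\&hellip;|g' \
      -e 's|\. \. \.|\&hellip;|g' \
      -e 's|"\([^"<>]*\)"|\&ldquo;\1\&rdquo;|g'
}

# Made-up sample line, not taken from any book.
printf '%s\n' '"Well . . . I guess," he said.' | curl_quotes
# prints: &ldquo;Well &hellip; I guess,&rdquo; he said.
```

Because the quote rule only matches a complete pair on one line, a quotation that spans several lines is left untouched by this sketch.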

Are there other – maybe nicer – ways to do this task? My one-liner works well with Baen books, but has some limitations.

By the way, why is it that Baen Mobipocket e-books look nicer when you explode the MS Reader LIT and convert it to Mobipocket yourself?

HarryT
02-21-2008, 08:57 AM
By the way, why is it that Baen Mobipocket e-books look nicer when you explode the MS Reader LIT and convert it to Mobipocket yourself?

Until a couple of months ago, Baen used a pretty bad Mobipocket creator; those are the files you get with a ".prc" extension. They've now started to use a much better converter, and the new files (which have a ".mobi" extension) are very nice.