Old 02-21-2008, 08:46 AM   #1
IceHand
Linux User
Posts: 323
Karma: 13682
Join Date: Aug 2007
Location: Germany
Device: Kindle 3
Beautify Baen e-books

I have quite a few e-books from Baen and noticed that they use straight quotes instead of curly quotes. So I searched for a script or program that automatically converts the quotes, but couldn't find one that I could use. Then I discovered the program "sed" and made a dirty little one-liner that converts the HTML file from the exploded Baen LIT:
Code:
#!/bin/bash
set -e

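mv "$1" "$1.backup"

# Stages, in order:
#   1. pair up straight double quotes into curly ones (a leftover quote opens)
#   2. put straight quotes back around HTML attribute values
#   3. convert apostrophes and single quotes
#   4. write the curly characters as HTML entities
#   5. collapse ellipses (four dots before three, so both become one entity)
#   6. restore the straight quotes in the DOCTYPE line
sed -e 's|"\([^"][^"]*\)"|“\1”|g' -e 's|"|“|g' \
    -e 's|=“[^”]*”|="&"|g' -e 's|=“||g' -e 's|”"|"|g' \
    -e 's|=”[^“]*“|="&"|g' -e 's|=”||g' -e 's|“"|"|g' \
    -e "s| '| ‘|g" -e "s|'|’|g" \
    -e 's|“|\&ldquo;|g' -e 's|”|\&rdquo;|g' -e 's|‘|\&lsquo;|g' -e 's|’|\&rsquo;|g' \
    -e 's|\. \. \. \.|\&hellip;|g' -e 's|\.\.\.\.|\&hellip;|g' -e 's|\.&nbsp;\.&nbsp;\.&nbsp;\.|\&hellip;|g' \
    -e 's|\. \. \.|\&hellip;|g' -e 's|\.\.\.|\&hellip;|g' -e 's|\.&nbsp;\.&nbsp;\.|\&hellip;|g' \
    -e 's|&ldquo;+//|"+//|g' -e 's|//EN&rdquo;|//EN"|g' \
    -e 's|&ldquo;http://openebook|"http://openebook|g' -e 's|\.dtd&rdquo;>|.dtd">|g' \
    "$1.backup" > "$1"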

exit 0
Maybe someone will find this useful. It converts straight quotes to curly quotes and ". . ." and ". . . ." to "&hellip;" (…) – and of course it makes a backup of the original HTML file.
I haven't tried it, but there's sed for Windows too.
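
To try the idea on a single line without touching any file, here is a minimal sketch. It is a reduced, ASCII-only variant, not the full script: it assumes the output should use HTML entities such as &ldquo;, it skips the apostrophe and DOCTYPE handling, and the function name curl_quotes and the sample line are made up for illustration.

```shell
#!/bin/bash
# Sketch only: reads HTML on stdin, writes the converted text to stdout.
# curl_quotes is a hypothetical helper name, not part of the script above.
curl_quotes() {
  # 1. pair up straight double quotes and write them as entities
  # 2. put straight quotes back around attribute values (=&ldquo;...&rdquo;)
  # 3. collapse spaced ellipses, four dots before three
  sed -e 's|"\([^"][^"]*\)"|\&ldquo;\1\&rdquo;|g' \
      -e 's|=&ldquo;\([^&]*\)&rdquo;|="\1"|g' \
      -e 's|\. \. \. \.|\&hellip;|g' \
      -e 's|\. \. \.|\&hellip;|g'
}

# Made-up sample line with one attribute and one quoted sentence.
sample='<p class="story">"Wait . . ." he said.</p>'
result=$(printf '%s\n' "$sample" | curl_quotes)
printf '%s\n' "$result"
```

The attribute keeps its straight quotes while the dialogue gets entities. One limitation of this reduced variant: an attribute value that itself contains "&" (for example a URL with &amp;) is not repaired by the second expression, so check such lines by hand.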

Are there other – maybe nicer – ways to do this task? My one-liner works well with Baen books, but has some limitations.

By the way, why is it that Baen Mobipocket e-books look nicer when exploding the MS Reader LIT and converting to Mobipocket yourself?

Last edited by IceHand; 02-22-2008 at 12:30 PM.