Old 06-24-2010, 08:43 AM   #5
frabjous
Wizard
Posts: 1,213
Karma: 12890
Join Date: Feb 2009
Location: Amherst, Massachusetts, USA
Device: Sony PRS-505
Am I the only one who can see the numbers in this quote?

Quote:
The Destruction of Dresden was first published in Great Britain by William Kimber & Co. Ltd on April , ; in a revised and updated edition by Corgi Books Ltd in ; and by Papermac, a division of Macmillan Publishers Ltd, in .
They're Unicode characters U+F730 through U+F739 (for 0 through 9), which are in the "private use area" of Unicode. I don't know which fonts support them, but on my system, the fallback seems to be the Norasi Unicode fonts (designed for Thai).

Yes, their appearance is of "lowercase" or "old style" numerals.
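
Before replacing anything, it may be worth checking whether a file actually contains these characters. Here is a rough sketch, assuming GNU grep; the function name count_pua_digit_lines is made up for illustration. It relies on U+F730 through U+F739 being encoded in UTF-8 as the bytes EF 9C B0 through EF 9C B9, and matches those bytes directly so the locale doesn't matter:

```bash
# Hypothetical helper: count the lines on stdin that contain at least one
# private-use digit (U+F730..U+F739).
count_pua_digit_lines() {
  # In UTF-8 these are the bytes EF 9C followed by one byte in B0..B9
  # (octal 357 234, then 260..271). LC_ALL=C makes grep compare raw bytes.
  LC_ALL=C grep -c -e "$(printf '\357\234[\260-\271]')"
}
```

Something like `count_pua_digit_lines < book.html` (file name is just an example) prints the number of affected lines; grep exits with status 1 when the count is zero.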

It should be possible to do a find and replace on them using this information. Rather than starting with calibre, you might want to use pdftohtml and pdfreflow to get an HTML file, do the find and replace in the HTML, and then use calibre to convert it to ePub. This is all scriptable. Something like:

Code:
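#!/bin/bash
# Needs a UTF-8 locale, GNU sed, and bash 4.2+ (for the \u escapes).
filename="$1"
pdftohtml -xml "$filename"
pdfreflow "${filename%.pdf}.xml"
# Map the private-use digits U+F730..U+F739 to ASCII 0..9.
sed -i -e $'y/\uf730\uf731\uf732\uf733\uf734\uf735\uf736\uf737\uf738\uf739/0123456789/' "${filename%.pdf}.html"
ebook-convert "${filename%.pdf}.html" "${filename%.pdf}.epub"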
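
The replacement step can also be done byte by byte, which sidesteps any question of whether the shell and sed agree on a UTF-8 locale. The following is only a sketch under that assumption, not a tested part of the pipeline above; the function name pua_digits_to_ascii is invented for illustration. It builds ten sed substitutions from the UTF-8 bytes of U+F730 through U+F739 (EF 9C B0 through EF 9C B9):

```bash
# Hypothetical filter: read text on stdin, write it to stdout with the
# private-use digits U+F730..U+F739 replaced by ASCII 0..9.
pua_digits_to_ascii() {
  script=""
  for d in 0 1 2 3 4 5 6 7 8 9; do
    # Last byte is 0xB0 + d (decimal 176 + d); printf wants it in octal.
    last=$(printf '%o' $((176 + d)))
    # Emit the three raw bytes EF 9C Bx (octal 357 234 26x/27x).
    pua=$(printf "\\357\\234\\$last")
    script="${script}s/${pua}/${d}/g;"
  done
  # LC_ALL=C makes sed treat its input and the script as plain bytes.
  LC_ALL=C sed -e "$script"
}
```

It could be used as `pua_digits_to_ascii < book.html > book-fixed.html` (both file names are placeholders), with the result then handed to ebook-convert.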

Last edited by frabjous; 06-24-2010 at 09:00 AM.