Am I the only one who can see the numbers in this quote?
Quote:
The Destruction of Dresden was first published in Great Britain by William Kimber & Co. Ltd on April , ; in a revised and updated edition by Corgi Books Ltd in ; and by Papermac, a division of Macmillan Publishers Ltd, in .
They're Unicode characters U+F730 through U+F739 (for 0 through 9), which are in the "private use area" of Unicode. I don't know which fonts support them, but on my system, the fallback seems to be the Norasi Unicode fonts (designed for Thai).
Yes, their appearance is of "lowercase" or "old style" numerals.
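To check whether a given file actually contains these characters, something like the following might help. It is only a sketch: the function name count_pua_digit_lines is made up for illustration, and it assumes the file is UTF-8 encoded, where U+F730 through U+F739 are stored as the bytes EF 9C B0 through EF 9C B9.

```shell
#!/bin/bash
# Hypothetical helper: print how many lines of a UTF-8 file contain
# at least one of the private-use digits U+F730..U+F739.
count_pua_digit_lines() {
    local patterns="" d
    for d in 0 1 2 3 4 5 6 7 8 9; do
        # Build the raw UTF-8 bytes EF 9C (B0+d) with octal printf escapes,
        # one fixed-string pattern per line.
        patterns+="$(printf "\\357\\234\\$(printf '%o' $((0xB0 + d)))")"$'\n'
    done
    # LC_ALL=C makes grep compare bytes, so the locale does not matter.
    LC_ALL=C grep -c -F -e "${patterns%$'\n'}" -- "$1"
}
```

Calling it as count_pua_digit_lines book.html (book.html being a placeholder name) prints 0 if none of the ten characters occur.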
It should be possible to do a find and replace on them using this information. Rather than starting with calibre, you might want to use pdftohtml and pdfreflow to get an HTML file, do the find and replace in the HTML, and then use calibre to convert to ePub. This is all scriptable. Something like:
Code:
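#!/bin/bash
# Needs bash 4.2+ and a UTF-8 locale for the $'\u....' escapes below.
filename="$1"
base="${filename%.pdf}"
# The ten private-use "digits", U+F730 through U+F739
pua=$'\uf730\uf731\uf732\uf733\uf734\uf735\uf736\uf737\uf738\uf739'
pdftohtml -xml "$filename"
pdfreflow "$base.xml"
# Map each private-use character to the matching ASCII digit
sed -i -e "y/${pua}/0123456789/" "$base.html"
ebook-convert "$base.html" "$base.epub"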
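The substitution step only works if sed receives the ten private-use characters intact, which can go wrong when the script is pasted through something that strips them or when the locale is not UTF-8. A byte-level variant avoids that. This is a sketch, not tested against real pdfreflow output; the function name pua_to_ascii is hypothetical, and it assumes the HTML is UTF-8 encoded.

```shell
#!/bin/bash
# Hypothetical filter: read stdin, replace U+F730..U+F739 with ASCII 0..9,
# write the result to stdout.
pua_to_ascii() {
    local script="" d ch
    for d in 0 1 2 3 4 5 6 7 8 9; do
        # UTF-8 encoding of U+F730+d is the three bytes EF 9C (B0+d).
        ch=$(printf "\\357\\234\\$(printf '%o' $((0xB0 + d)))")
        script+="s/${ch}/${d}/g;"
    done
    # LC_ALL=C makes sed match raw bytes rather than locale characters.
    LC_ALL=C sed -e "$script"
}
```

It could stand in for the sed line, for example as pua_to_ascii < book.html > book.fixed.html, where both file names are placeholders.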