Old 03-27-2013, 11:19 PM   #3
davidfor
When I fix books like this, I use Sigil. That allows me to see the epub and do search and replace across all files in it. For a book like this, which is probably an OCRed scan, that means I can fix the repeated mistakes easily. It also has a spelling checker, and you can add words or names to the list to ignore.
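
To get a feel for how widespread a repeated mistake is before opening the book in Sigil, something like the following could be used. It is only a rough sketch, not how Sigil works internally: the function name `count_matches_in_epub` is made up for illustration, and it assumes the text files in the epub end in .html, .xhtml or .htm and are UTF-8 encoded. It only reads the epub and never changes it.

```python
import re
import zipfile


def count_matches_in_epub(epub, pattern):
    """Return {file name: match count} for the HTML files in an epub
    that contain at least one match of the regular expression."""
    regex = re.compile(pattern)
    counts = {}
    # An epub is a zip archive, so zipfile can read it directly.
    with zipfile.ZipFile(epub) as book:
        for name in book.namelist():
            # Assumption: the book text lives in (X)HTML files.
            if not name.lower().endswith(('.html', '.xhtml', '.htm')):
                continue
            text = book.read(name).decode('utf-8', errors='replace')
            hits = len(regex.findall(text))
            if hits:
                counts[name] = hits
    return counts
```

Calling `count_matches_in_epub('book.epub', r'<p> "')` (with a hypothetical file name) would list the files that have a space between the opening paragraph tag and a quote, and how often it happens in each.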

I use Tweak Book in calibre for simple things, mainly to edit the stylesheet or fix a single spelling mistake. Don't forget to press the "Rebuild" button. I've hit the cancel button or escape key a few times without thinking and wondered where the changes went.

Having the <p> tags start on a new line won't add spaces. From what I can see, the extra spaces are between the <p> and </p>. There are a lot wrapping the quotes and other punctuation. The first change I would make would be a global replace of

Code:
<p> "
to
Code:
<p>"
And similarly for quotes at the end of the paragraph. Fixing the broken paragraphs is a bit harder. Generally I only pick them up as I read. But Sigil does have a saved search that can help. From memory, it looks for any end-paragraph tag that has a letter immediately before it.
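
For anyone who prefers a script, both ideas can be sketched with regular expressions. This is an approximation under assumptions, not Sigil's actual saved search: the function names are made up, the quote characters in the patterns are guesses at what the book uses, and "letter" is taken to mean a plain ASCII letter.

```python
import re

# Whitespace between an opening <p ...> tag and an opening quote.
OPEN_QUOTE = re.compile(r'(<p\b[^>]*>)\s+(["“‘])')
# Whitespace between a closing quote and the </p> tag.
CLOSE_QUOTE = re.compile(r'(["”’])\s+(</p>)')
# A </p> directly after a letter: the paragraph may have been cut mid-sentence.
BROKEN_END = re.compile(r'[A-Za-z]</p>')


def tidy_quote_spacing(html):
    """Remove the stray spaces next to quotes at paragraph start and end."""
    html = OPEN_QUOTE.sub(r'\1\2', html)
    return CLOSE_QUOTE.sub(r'\1\2', html)


def find_suspect_paragraph_ends(html, context=30):
    """Return a short snippet of text leading up to each suspicious </p>."""
    return [html[max(0, m.start() - context):m.end()]
            for m in BROKEN_END.finditer(html)]
```

The second function only reports candidates. A heading or caption wrapped in a paragraph tag can legitimately end in a letter, so each hit still needs a look before two paragraphs are joined.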

The "<div class="newpage" id="page-405"/>" tags have probably been put there by the person who created the file to map back to the pages of the original book. The desktop ADE didn't seem to use them. As there isn't a definition of the "newpage" class, I don't think they will do anything. I have seen a similar thing in other books, but they used an anchor tag.
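
A quick way to double-check both points (which page markers exist, and whether any stylesheet mentions the class) might look like this. It is a minimal sketch with made-up function names, and it assumes the markers are always written exactly in the form quoted above. The class check is a plain text search, so it would also match a mention inside a CSS comment.

```python
import re

# Matches markers written like <div class="newpage" id="page-405"/>.
PAGE_MARKER = re.compile(r'<div\s+class="newpage"\s+id="(page-\d+)"\s*/>')


def page_marker_ids(html):
    """Return the id of every page marker found in the HTML text."""
    return PAGE_MARKER.findall(html)


def class_is_mentioned(css, class_name):
    """True if the stylesheet text contains a .class_name selector."""
    pattern = r'\.' + re.escape(class_name) + r'(?![\w-])'
    return re.search(pattern, css) is not None
```

If `class_is_mentioned` returns False for every stylesheet in the book, the markers have no styling attached to them.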

Just remembered: I looked in the MR library for a copy of this. There is only a German version. Unless you are trying to match a paper copy you have, it might be useful to look at it to get ideas of the styles used and to compare the punctuation.

Last edited by davidfor; 03-27-2013 at 11:36 PM. Reason: Accidentally hit the submit button.