Old 12-24-2017, 05:39 PM   #8
BetterRed
null operator (he/him)
Posts: 21,744
Karma: 30237526
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Originally Posted by retiredbiker
When I find one of these old chestnuts, if it has "mso-whatever" all over it, it's usually a terrible mess. Maybe converted several times, as well - including to/from PDF. Often it's so bad I just have Calibre make an RTF out of it, and start over in LibreOffice and/or gedit - remove all the formatting, preserve italics if possible, and rebuild it. Often faster than trying to correct it in the Calibre editor.


I use Mammoth on DOCX files that I've converted from professionally created public-domain PDFs from institutional sources (.gov, .org, .edu, etc.). Trying to do the same with commercial PDFs is usually pointless.

Much-converted texts invariably have boatloads of content errors too: OCR-induced spelling errors, casing anomalies, broken paragraphs, and quote marks all over the place (straight, bent, missing, mismatched, superfluous, etc.).
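One rough way to triage the quote-mark problem is to count straight and curly double quotes per paragraph and flag the ones that don't add up. The Python below is only a sketch of that idea, not one of the Word macros mentioned in this post; the name quote_report and the "suspect" rule are assumptions made up for illustration.

```python
def quote_report(paragraph: str) -> dict:
    """Count double-quote styles in one paragraph and flag likely problems."""
    straight = paragraph.count('"')      # typewriter-style quotes
    opening = paragraph.count("\u201c")  # left curly double quote
    closing = paragraph.count("\u201d")  # right curly double quote

    # Assumed heuristic: a paragraph is suspect if it mixes straight and
    # curly quotes, has an odd number of straight quotes, or has unequal
    # numbers of opening and closing curly quotes.
    mixed = straight > 0 and (opening + closing) > 0
    suspect = mixed or straight % 2 == 1 or opening != closing

    return {
        "straight": straight,
        "opening": opening,
        "closing": closing,
        "suspect": suspect,
    }
```

Bear in mind that dialogue running over several paragraphs legitimately opens a quote without closing it, so a flagged paragraph is a candidate for a manual look, not a confirmed error.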

Like you, I prefer to start over, but in my case it's always rather than often; I have 30 years' worth of Word usage and macro/add-on gathering at my disposal.

I don't seek to create typographical replicas of the original. The reverse, in fact: I remove embellishments such as graphical scene markers and first-paragraph effects such as drop caps. I do unindent those first paragraphs, though. I also like to spell out chapter numbers, e.g. Twenty-two instead of 22.
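For anyone working outside Word, the chapter-number step could be scripted along these lines. This is a minimal Python sketch, not the macro actually used here: the function names are hypothetical, it only covers 1 to 99, and it assumes headings look like "Chapter 22" on a line of their own.

```python
import re

ONES = ["", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def spell_out(n: int) -> str:
    """Spell out 1..99 with an initial capital, e.g. 22 -> 'Twenty-two'."""
    if not 1 <= n <= 99:
        raise ValueError("this sketch only handles 1..99")
    if n < 20:
        words = ONES[n]
    else:
        tens, ones = divmod(n, 10)
        words = TENS[tens] + ("-" + ONES[ones] if ones else "")
    return words.capitalize()


def spell_chapter_headings(text: str) -> str:
    """Rewrite lines like 'Chapter 22' as 'Chapter Twenty-two'."""
    def repl(match: re.Match) -> str:
        number = int(match.group(2))
        if not 1 <= number <= 99:
            return match.group(0)  # leave out-of-range headings untouched
        return f"{match.group(1)} {spell_out(number)}"

    # [ \t] rather than \s so the match never swallows a line break.
    return re.sub(r"^(Chapter)[ \t]+(\d{1,2})[ \t]*$", repl, text,
                  flags=re.MULTILINE)
```

Calling spell_chapter_headings on the full text of a book leaves body text alone, because only lines consisting of "Chapter" plus a one- or two-digit number are matched.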

BR

Last edited by BetterRed; 12-24-2017 at 05:42 PM.