07-03-2014, 02:15 PM   #25
Hitch
Bookmaker & Cat Slave
Posts: 11,503
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
Quote:
Originally Posted by BobC View Post
@Hitch,


We have different needs - in your case you get pretty good original stuff (i.e. all the words are right) with perhaps poor formatting; a lot of mine starts off as really messy OCR'd text from Archive.org, and I need to do a lot of manual editing and re-formatting to knock it into shape. The extensibility of LO makes it useful, with some great regex-based alternative S&R additions - I often use OOOFBTools just for its text-tidying ability (I've stopped using FB2 as my preferred ebook format). As I use LO as my normal WP, it all works seamlessly for me.

Anyway, I have noticed that in LO some italics are not true italic - they look like italic but can't be found via the normal italic search. They seem to use a variant of "character posture"; it may be ones like this that are getting mis-imported.

Beware also that the standard Writer2xhtml doesn't work with current editions of LO - you need to get a patched version. Have a look at this:
https://www.mobileread.com/forums/sho...&postcount=224

BobC
BobC,

Thanks for the info. I'll look it over again. However, one small comment:

Quote:
...in your case you get pretty good original stuff (i.e. all the words are right) with perhaps poor formatting...


Umhmmmm. That would be nice. I wonder who gets that? Probably the companies who have contracts with BPHs. ;-)

We do get a ton of OCR'ed material, too. Not to mention, my fave: the DIY scans. Those are real doozies. Sort of like first-gen PG stuff. And I think I've posted the odd weird thing that crosses my desk (like that one file, with the pilcrows at the beginning of the lines...)

Hitch