01-14-2021, 06:24 PM   #10
davidfor
Grand Sorcerer
Posts: 24,905
Karma: 47303824
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
Quote:
Originally Posted by 4rm View Post
Thanks for pointing me to ScrambleEbook. I've attached the scrambled ebooks below. I tested the scrambled ebooks' compatibility, and it mirrors the unscrambled ebooks: The original and Twice_converted ebooks fail to open and crash the ereader, and the kepub opens and runs correctly.
I have to say "WOW!!!!" In the last few days I looked at a book and declared it was the worst coded book I have ever seen. Compared to this one, that book was well coded.

I looked at the epub version before putting it on one of my ereaders. It was obvious what the problem was. Everything is wrapped in spans. Almost every word is in its own span. Almost every space is in its own span. And there are over 42000 classes in the stylesheet. 35693 of them are named "text_nnnnn" and are used in those spans. Most of these barely differ from one another: they set colours that are almost identical, or set the letter spacing or something else that probably isn't needed.
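If you want to check a book for this yourself, something like the following rough Python sketch would do the counting. The function names and the sample strings are made up for illustration, and I'm assuming you have already unzipped the epub and read the stylesheet and one of the XHTML files into strings.

```python
import re


def count_text_classes(css: str) -> int:
    """Count distinct class selectors named text_<digits> in a stylesheet."""
    return len(set(re.findall(r"\.(text_\d+)\b", css)))


def count_spans(xhtml: str) -> int:
    """Count opening <span> tags in a chunk of XHTML."""
    return len(re.findall(r"<span\b", xhtml))


# Hypothetical sample data, shaped like what the post describes.
sample_css = (
    ".text_00001 { color: #000001; }\n"
    ".text_00002 { color: #000002; letter-spacing: 0.01em; }\n"
    ".body { margin: 0; }\n"
)
sample_xhtml = (
    '<p><span class="text_00001">Hello</span>'
    '<span class="text_00002"> </span>'
    '<span class="text_00001">world</span></p>'
)

print(count_text_classes(sample_css))
print(count_spans(sample_xhtml))
```

A span count close to the word count of the book is a good sign that the OCR program is styling every word individually.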

I am completely sure that this is failing because the book simply takes too long to render: there is just too much in it. I have it open in the calibre editor and have been running a "beautify" on it for the last 20 minutes. That's on a laptop with an i7 and 16GB of memory.

You said it was OCRed from scans. I think you need to look at the options in the OCR program and turn off whatever is doing the above. The text colour and letter spacing aren't needed. Or maybe have it produce a DOCX or similar and do the cleanup there. You can edit the epub, but it will take time.
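If you do go the route of editing the epub, the bulk of the work is unwrapping those spans. Here is a minimal Python sketch of that idea, not something I have run on this book. It assumes the spans look exactly like <span class="text_nnnnn">...</span> with no other attributes, and the function name is made up. It throws away the per-word colour and letter spacing, so work on a copy.

```python
import re

# Matches a text_nnnnn span whose content holds no further opening <span>,
# so nested spans get unwrapped innermost first.
TEXT_SPAN = re.compile(
    r'<span\s+class="text_\d+"\s*>((?:(?!<span\b).)*?)</span>',
    re.DOTALL,
)


def unwrap_text_spans(xhtml: str) -> str:
    """Replace each text_nnnnn span with its content, repeating until stable."""
    previous = None
    while previous != xhtml:
        previous = xhtml
        xhtml = TEXT_SPAN.sub(r"\1", xhtml)
    return xhtml


# Hypothetical sample, shaped like the markup described above.
sample = (
    '<p><span class="text_00001">Hello</span>'
    '<span class="text_00002"> </span>'
    '<span class="text_00001">world</span></p>'
)
print(unwrap_text_spans(sample))
```

Spans with any other class, and text_nnnnn spans that wrap one of those, are left alone, which seemed the safer choice. Once the spans are gone, the unused text_nnnnn rules still need to be removed from the stylesheet.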