Old 09-23-2022, 03:07 AM   #9
haertig
Wizard
Posts: 1,430
Karma: 17928604
Join Date: Sep 2017
Device: PW3, Fire HD8 Gen7, Moto G7, Sansa Clip v2, Ruizu X26
Quote:
Originally Posted by jhowell
Usually true. However an EPUB that was originally sourced from a PDF or scan may require extensive page by page clean up.
I rarely mess with PDFs. Sometimes I'll have a PDF that I really want to see on my eReader. But Kindles, and probably every other eReader, are horrid for viewing PDFs (larger tablets, where you can quickly and easily zoom in and out, are much better).

I have been able to rescue some PDFs, though. Most of the time the problem is that each page's header and footer survive the conversion to EPUB as ordinary text, and they no longer fall on page boundaries on your eReader. I have had some luck in Calibre's editor after the initial conversion to EPUB. Sometimes you can look through the book's markup and come up with a regular expression that matches the header/footer. Then you can search for this regular expression and delete it wherever it is found. For all I know I am ending up with invalid EPUBs doing things like this. But if they display well enough on my device for me to read them, I call it good enough. I'm not publishing these for the general public.
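To make the search-and-delete idea concrete, here is a minimal sketch in Python rather than in Calibre's editor. The markup, the book title "The Example Book" and the "Page N" footer are all made up for illustration; a real conversion will produce different tags and text, so the pattern has to be worked out from the actual book.

```python
import re

# Hypothetical pattern: a paragraph that contains nothing but the running
# header ("The Example Book") or the footer ("Page 12"). Both strings are
# assumptions for this example, not taken from any real book.
HEADER_FOOTER = re.compile(
    r'<p[^>]*>\s*(?:The Example Book|Page \d+)\s*</p>\s*'
)

def strip_headers_footers(html: str) -> str:
    """Delete every paragraph that holds only the header or footer text."""
    return HEADER_FOOTER.sub("", html)

# Made-up fragment of converted markup: a sentence interrupted by a
# page footer and the next page's header.
sample = (
    '<p class="calibre1">ends mid-sentence on one</p>\n'
    '<p class="calibre1">Page 12</p>\n'
    '<p class="calibre1">The Example Book</p>\n'
    '<p class="calibre1">page and continues here.</p>\n'
)

cleaned = strip_headers_footers(sample)
print(cleaned)
```

The pattern is anchored on the whole paragraph element, so body text that merely mentions the title is left alone. The same expression, with an empty replacement, should be usable in any editor that supports regular expression search and replace.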

I have not done this in eons. IIRC, I didn't have to do it on a page-by-page basis. But I don't think I could do it once for the entire book in one fell swoop, either. I really don't remember. What I think I remember is having to do it once for part0001.html, once for part0002.html, etc. However, I am not very good at book editing and I could very easily have missed "the easy way" to do this. My clunky way is doable, for some books, if you are really determined to turn that PDF into an EPUB and you understand regular expression matching.
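If repeating the replacement file by file gets tedious, one alternative is a small script run over the book's HTML files after the EPUB has been unzipped. This is not what was done in Calibre's editor; it is only a sketch, and the "Page N" footer pattern, the `clean_book` helper and the sample files are assumptions for illustration.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical footer pattern; adjust it to the real book's markup.
FOOTER = re.compile(r'<p[^>]*>\s*Page \d+\s*</p>\s*')

def clean_book(folder, pattern=FOOTER):
    """Apply the pattern to every .html file in folder.

    Returns a dict mapping file name to the number of deletions made.
    """
    counts = {}
    for path in sorted(Path(folder).glob("*.html")):
        text = path.read_text(encoding="utf-8")
        new_text, n = pattern.subn("", text)
        if n:  # rewrite the file only when something was removed
            path.write_text(new_text, encoding="utf-8")
        counts[path.name] = n
    return counts

# Demo on two throwaway files named like the parts of a converted book.
with tempfile.TemporaryDirectory() as tmp:
    for name, page in (("part0001.html", 1), ("part0002.html", 2)):
        Path(tmp, name).write_text(
            f"<p>Some text.</p>\n<p>Page {page}</p>\n<p>More text.</p>\n",
            encoding="utf-8",
        )
    counts = clean_book(tmp)
    first_file = Path(tmp, "part0001.html").read_text(encoding="utf-8")

print(counts)
```

The per-file counts make it easy to spot a part where the pattern matched nothing, which usually means the header or footer is marked up differently there. Work on a copy of the book, since the files are rewritten in place.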