Quote:
Originally Posted by jhowell
Usually true. However an EPUB that was originally sourced from a PDF or scan may require extensive page by page clean up.
I rarely mess with PDFs. Sometimes I'll have a PDF that I really want to see on my eReader. But Kindles, and probably every other eReader, are horrid for viewing PDFs (larger tablets, where you can quickly and easily zoom in and out, are much better).
I have been able to rescue some PDFs though. Most of the time the problem is that each page's header and footer survive the conversion to EPUB, and these headers/footers no longer match page boundaries on your eReader, so they turn up in the middle of the text. I have had some luck in Calibre's editor after the initial conversion to EPUB. Sometimes you can look through the book's markup and come up with a regular expression that matches the header/footer. Then you can "search" for this regular expression and "delete" it wherever you find it. For all I know I am ending up with invalid EPUBs doing things like this. But if they display well enough on my device for me to read them, I call it good enough. I'm not publishing these for the general public.
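To give a rough idea of what that search-and-delete amounts to, here is a minimal Python sketch. The markup, the book title, and the "Page N of M" footer are all made up for illustration; a real converted book will have its own header/footer text and its own class names, so the pattern has to be worked out by looking at the actual markup.

```python
import re

# Hypothetical header/footer markup. "My Book Title" and "Page N of M"
# are invented examples, not what any particular converter produces.
HEADER_FOOTER = re.compile(
    r'<p[^>]*>\s*(?:My Book Title|Page \d+ of \d+)\s*</p>\s*'
)

def strip_headers_footers(html: str) -> str:
    """Delete every paragraph that matches the header/footer pattern."""
    return HEADER_FOOTER.sub("", html)

# A made-up fragment where a page break fell in the middle of a sentence.
sample = (
    '<p class="calibre1">My Book Title</p>\n'
    '<p class="calibre1">Some real text that continues</p>\n'
    '<p class="calibre1">Page 12 of 300</p>\n'
    '<p class="calibre1">onto the next page.</p>\n'
)

cleaned = strip_headers_footers(sample)
print(cleaned)
```

The same pattern, with an empty replacement, is what you would type into a regex search-and-replace box. Only whole paragraphs are removed here, so the surrounding markup stays balanced.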
I have not done this in eons. IIRC, I didn't have to do it on a page-by-page basis. But I don't think I could do it once, for the entire book in one fell swoop, either. I really don't remember. What I think I remember is having to do it once for part0001.html, once for part0002.html, and so on. However, I am not very good at book editing and I could very easily have missed "the easy way" to do this. It is doable my clunky way, for some books, if you are really determined to turn that PDF into an EPUB. And if you understand regular expression matching.
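If repeating the replacement file by file gets tedious, one alternative (outside the editor, and not something I have tried on a real book) is a small script. An EPUB is a zip archive, so this sketch assumes the book has already been unzipped into a folder and that the text lives in .html files; the function name and the page-number pattern are hypothetical and only illustrate the loop.

```python
import re
from pathlib import Path

# Hypothetical pattern: a paragraph that holds nothing but a page number.
PAGE_NUMBER = re.compile(r'<p[^>]*>\s*\d+\s*</p>\s*')

def clean_all_parts(folder, pattern=PAGE_NUMBER):
    """Apply the same deletion to every .html file under `folder`.

    Returns the names of the files that were actually changed.
    """
    changed = []
    for path in sorted(Path(folder).rglob("*.html")):
        text = path.read_text(encoding="utf-8")
        new_text = pattern.sub("", text)
        if new_text != text:
            # Overwrites the file in place, so work on a copy of the book.
            path.write_text(new_text, encoding="utf-8")
            changed.append(path.name)
    return changed

# Example call (the folder name is an assumption):
# clean_all_parts("my_book_unzipped")
```

Because it rewrites files in place, it is worth running on a copy and spot-checking a few files before zipping anything back up.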