Thread: PDF to epub
10-11-2011, 11:44 AM   #4
Poppeye
Junior Member
Posts: 4
Karma: 10
Join Date: Oct 2011
Device: Android
I've just tried pdf2epub on an untagged PDF with a typical header/footer inserted by some kind of html2pdf generator, which Acrobat recognizes only as artefacts. Both calibre and pdf2epub convert it to epub reasonably well, although the pdf2epub approach is much more intuitive and preserves more formatting.

Both approaches have some shortcomings, too:

1. Calibre

I could get rid of the annoying header/footer text with the Search & Replace function. The result was pleasing, but Calibre still fell a bit short on paragraph detection: it would start a new paragraph after ANY kind of punctuation, including ..., - and even '. If there were a way to exclude some punctuation from the detection mechanism, the result could be almost perfect.
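To illustrate what I mean by excluding some punctuation, here is a rough Python sketch. It is not how Calibre actually works internally; the names `PARAGRAPH_END` and `join_lines` are made up for this example. The assumed rule is that a line only closes a paragraph if it ends in `.`, `!` or `?` (optionally followed by a closing quote or bracket), while an ellipsis, a dash or an apostrophe does not count.

```python
import re

# Assumed rule (not Calibre's): sentence-final punctuation, optionally followed
# by closing quotes/brackets, at the very end of the line. The lookbehind
# rejects a full stop that is the last dot of an ellipsis ("...").
PARAGRAPH_END = re.compile(r"(?<!\.\.)[.!?][\"')\]]*$")

def join_lines(lines):
    """Group extracted text lines into paragraphs using the rule above."""
    paragraphs = []
    current = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # ignore empty lines left over from extraction
        current.append(text)
        if PARAGRAPH_END.search(text):
            paragraphs.append(" ".join(current))
            current = []
    if current:
        # trailing text without closing punctuation still forms a paragraph
        paragraphs.append(" ".join(current))
    return paragraphs

sample = [
    "He paused...",
    "and then went on - slowly -",
    "to the boys'",
    "room.",
    "Next paragraph starts here.",
]
for paragraph in join_lines(sample):
    print(paragraph)
```

With a rule like this, the first four sample lines stay together as one paragraph instead of being split after the ellipsis, the dash and the apostrophe.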

2. pdf2epub

pdf2epub did all the work for me: I just had to mark the header and footer areas as background and then export the document (Save as). However, the resulting epub had a paragraph detection problem at page breaks. pdf2epub would start a new paragraph on each new page, regardless of punctuation, probably because of the excluded header and footer. If that problem could be fixed, this would be a great solution.
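The kind of fix I have in mind could look roughly like the Python sketch below. This is only my guess at a workable heuristic, not pdf2epub's code; `merge_across_pages` and `SENTENCE_END` are hypothetical names. It assumes each page has already been reduced to a list of paragraphs with header and footer removed, and it glues the first paragraph of a page onto the last paragraph of the previous page whenever that one does not end in sentence-final punctuation.

```python
import re

# Assumed test for "the previous paragraph is finished": ends in . ! or ?
# (optionally followed by a closing quote or bracket), but not in an ellipsis.
SENTENCE_END = re.compile(r"(?<!\.\.)[.!?][\"')\]]*$")

def merge_across_pages(pages):
    """Flatten per-page paragraph lists, joining paragraphs split by a page break."""
    merged = []
    for page in pages:
        for index, paragraph in enumerate(page):
            text = paragraph.strip()
            if not text:
                continue
            first_on_page = index == 0
            if first_on_page and merged and not SENTENCE_END.search(merged[-1]):
                # previous page ended mid-sentence: continue that paragraph
                merged[-1] = merged[-1] + " " + text
            else:
                merged.append(text)
    return merged

pages = [
    ["First paragraph.", "This sentence runs over the page"],
    ["break and ends here.", "New paragraph."],
]
for paragraph in merge_across_pages(pages):
    print(paragraph)
```

A paragraph that happens to end exactly at the bottom of a page stays separate, since it already closes with a full stop.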

As a private person I would probably not want to buy it, or at least not pay more than $5, as I only use it for my private ebook collection and I have more time than money. Should you find yourself having to do this kind of work for your job, though, this is a godsend of a timesaver and easily worth a good amount of money for the stress it saves you.

Last edited by Poppeye; 10-11-2011 at 01:03 PM.