Old 12-04-2010, 08:16 PM   #1
JGB
Groupie
Posts: 168
Karma: 1010000
Join Date: Jul 2008
Device: PRS505
PDF to epub

I quoted this from another thread, then realized I should have asked here.

Quote:
Originally Posted by vastav View Post
You may want to try the Acrobat plugin based ePub conversion solution available at http://www.pdf2epub.com which recognizes PDF tags, reconstructs paragraphs and does a decent job in retaining most of the font and layout attributes.
Does this work better than calibre's internal PDF-to-EPUB conversion?
Or is calibre's built-in PDF converter better? I kind of liked the idea of printing the PDF file to the correct size for my e-book reader with Acrobat and using that, if it is more likely to remain clean and usable. But would doing this first improve the conversion within calibre at all? In the end I'll always use calibre for what I can, since it just rocks, but I want to get the best-quality EPUB I can from my PDFs.
Old 12-05-2010, 12:58 PM   #2
Agama
Guru
Posts: 776
Karma: 2751519
Join Date: Jul 2010
Location: UK
Device: PW2, Nexus7
Quote:
Does this work better than calibre's internal PDF-to-EPUB conversion?
Looks interesting. Why not have a play with it and report back to us with your findings?
Old 01-22-2011, 04:00 AM   #3
vastav
Member
Posts: 18
Karma: 38
Join Date: Sep 2009
Location: San Francisco Bay Area
Device: none
Quote:
Originally Posted by JGB View Post

Does this work better than calibre's internal PDF-to-EPUB conversion?
Or is calibre's built-in PDF converter better? I kind of liked the idea of printing the PDF file to the correct size for my e-book reader with Acrobat and using that, if it is more likely to remain clean and usable. But would doing this first improve the conversion within calibre at all? In the end I'll always use calibre for what I can, since it just rocks, but I want to get the best-quality EPUB I can from my PDFs.
First the disclaimer - I am the creator of pdf2epub plugin for Acrobat.

The software should give you better results than calibre on several dimensions: better retention of font properties (font class, color, size), better detection of paragraphs and other text structure, and retention of PDF bookmarks as the navigation TOC in the ePub.

The solution uses Tags in PDF to drive the conversion process. So if your sources are Tagged PDFs, the results will likely be substantially different than Calibre's.
Old 10-11-2011, 11:44 AM   #4
Poppeye
Junior Member
Posts: 4
Karma: 10
Join Date: Oct 2011
Device: Android
I've just tried pdf2epub on an untagged PDF that has a typical header/footer inserted by some kind of html2pdf generator, which Acrobat recognizes only as artefacts. Both calibre and pdf2epub are able to convert it to epub reasonably well, although the pdf2epub approach is much more intuitive and conserves more formatting.

Both approaches have some shortcomings, too:

1. Calibre

I could get rid of the annoying extra text with the Search & Replace function. The result was pleasing, but Calibre still fell a bit short when it came to paragraph detection. Calibre would start a paragraph after ANY kind of punctuation, including ..., - and even '. If there was a way to exclude some punctuation from the detection mechanism, the result could be almost perfect.

2. PDF2EPUB

pdf2epub did all the work for me, I just had to mark the header and footer area as background and then export the document (Save as). However, the resulting epub showed a problem with paragraph detection when page breaks were present. pdf2epub would start a new paragraph on each new page, regardless of punctuation - probably because of the excluded header and footer. If that problem could be fixed, this would be a great solution.
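The paragraph-detection wish from point 1 can be illustrated with a small sketch. This is not calibre's or pdf2epub's actual algorithm; it is a minimal stand-alone illustration in Python, and the header pattern, the punctuation lists and the function names are all assumptions made up for the example. It drops header/footer lines with a regular expression (the Search & Replace step) and closes a paragraph only after punctuation that really ends a sentence, so "...", "-" and "'" no longer trigger a break.

```python
import re

# Hypothetical header/footer pattern; adapt it to the text in your own PDF.
HEADER_FOOTER = re.compile(r"^Page \d+ of \d+$")

# Line endings that are allowed to close a paragraph (assumed list).
PARAGRAPH_ENDERS = (".", "!", "?")
# Endings that look like punctuation but must NOT close a paragraph.
NON_ENDERS = ("...", "-", "'")


def ends_paragraph(line):
    """Return True if the line ends with sentence-closing punctuation."""
    stripped = line.rstrip()
    # Check the exclusions first, because "..." also ends with ".".
    if stripped.endswith(NON_ENDERS):
        return False
    return stripped.endswith(PARAGRAPH_ENDERS)


def rebuild_paragraphs(lines):
    """Drop header/footer lines and join the rest into paragraphs."""
    paragraphs, current = [], []
    for line in lines:
        text = line.strip()
        if not text or HEADER_FOOTER.match(text):
            continue  # skip blank lines and header/footer residue
        current.append(text)
        if ends_paragraph(text):
            paragraphs.append(" ".join(current))
            current = []
    if current:  # keep a trailing, unterminated paragraph
        paragraphs.append(" ".join(current))
    return paragraphs


if __name__ == "__main__":
    sample = [
        "Page 1 of 2",
        "He waited...",
        "and then spoke -",
        "quietly.",
        "Next one.",
    ]
    for paragraph in rebuild_paragraphs(sample):
        print(paragraph)
```

One known limitation of this sketch: a line ending in a closing quote after a full stop (.') is treated as unfinished, because "'" is on the exclusion list.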

As a private person I would probably not want to buy it, or at least not pay more than $5, as I only use it for my private ebook collection and I have more time than money. Should you find yourself having to do this kind of work for your job, though, it is a godsend of a timesaver and easily worth a goodly amount of money for the stress you avoid.

Last edited by Poppeye; 10-11-2011 at 01:03 PM.
Old 10-12-2011, 02:09 AM   #5
vastav
Quote:
Originally Posted by Poppeye View Post
2. PDF2EPUB
pdf2epub did all the work for me, I just had to mark the header and footer area as background and then export the document (Save as). However, the resulting epub showed a problem with paragraph detection when page breaks were present. pdf2epub would start a new paragraph on each new page, regardless of punctuation - probably because of the excluded header and footer. If that problem could be fixed, this would be a great solution.
There is a simple solution for this: before using pdf2epub, go to Document > Crop pages (in Acrobat 9) or Tools > Pages > Crop (in Acrobat 10) and crop all pages from the top and bottom to eliminate the areas where the header or footer appears. Then run Save As > ePub and you will find that paragraphs are maintained correctly across page boundaries. Hope this helps.
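To see why cropping helps, here is a rough Python sketch of the idea. It does not use Acrobat or pdf2epub and makes no claim about how either works internally; the data layout (each page as a list of `(y, text)` pairs in reading order, with y measured in points from the bottom of the page), the margin values and the function names are all assumptions for illustration. Once the lines in the header and footer bands are gone, the last body line of one page sits directly before the first body line of the next, so a sentence that runs over the page break can be joined.

```python
def crop_lines(page_lines, page_height, top, bottom):
    """Keep only lines whose y position lies inside the cropped area."""
    return [text for y, text in page_lines
            if bottom <= y <= page_height - top]


def join_pages(pages, page_height=792.0, top=50.0, bottom=50.0):
    """Crop every page, then build paragraphs across page boundaries.

    The defaults assume a US Letter page (792 points high) and
    50-point header/footer bands; both values are illustrative.
    """
    paragraphs, current = [], []
    for page_lines in pages:
        for text in crop_lines(page_lines, page_height, top, bottom):
            text = text.strip()
            current.append(text)
            # A paragraph closes only on sentence-ending punctuation,
            # never merely because a page ended.
            if text.endswith((".", "!", "?")):
                paragraphs.append(" ".join(current))
                current = []
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs


if __name__ == "__main__":
    pages = [
        [(770.0, "My Header"), (700.0, "The sentence starts here"), (20.0, "Page 1")],
        [(770.0, "My Header"), (700.0, "and ends on the next page."), (20.0, "Page 2")],
    ]
    print(join_pages(pages))
```

With the margins set to zero the header and footer lines stay in the text stream and end up in the middle of the sentence, which is roughly the situation the crop is meant to avoid.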
Old 10-12-2011, 06:29 AM   #6
Poppeye
Thanks, that was very helpful!