08-23-2011, 03:15 AM | #1 |
Zealot
Posts: 107
Karma: 1000
Join Date: Sep 2010
Location: Melbourne, Australia
Device: iPad2, Kindle
|
Convert PDF to epub?
I have a few books only available to me as PDFs (and Quark files - but I don't have Quark).
Any recommended tutorials for converting PDFs to epubs? |
08-23-2011, 04:30 AM | #2 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
Converting PDF to ePUB is cumbersome, to say the least. Your results depend on the PDF, of course. Sometimes Calibre gives reasonable results, as do some other tools. No tool gives a perfect conversion.
You could try to run the PDF through ABBYY, but that would also leave you with manual cleanup. Common issues:
- wrong OCR
- paragraphs at the wrong location or not at all
- totally wrong layout
- missing text
- headers and footers (including page numbers) throughout the text |
08-23-2011, 06:51 AM | #3 |
Color me gone
Posts: 2,089
Karma: 1445295
Join Date: Apr 2008
Location: Central Oregon Coast
Device: PRS-300
|
You can add duplicated caption text to the list of possible errors.
Some people use Mobipocket Creator and feed its HTML output into Calibre or Sigil. Whatever you do, plan on some work. BTW, these only work if the PDFs contain actual text and are not just containers for scans of the pages. PDFs containing only images will have to be OCRed and the resulting output, often a mess, cleaned up. Then you get to appreciate that a 2% error rate means errors on every page, multiplied by the number of pages to correct. |
08-23-2011, 08:02 AM | #4 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
Hmm, I get an error rate below 2% with OCR, depending on the source. Sometimes it is even closer to 0.2%.
|
08-23-2011, 10:35 AM | #5 |
frumious Bandersnatch
Posts: 7,516
Karma: 18512745
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
|
An error of 2% means 1 error every 50... what? Every 50 characters? That's unacceptable. Every 50 pages? That's too good to be true. Every 50 lines? That could be.
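To put rough numbers on that question, here is a minimal sketch. The page sizes are made-up assumptions (1,800 characters, 300 words, 30 lines per page), not figures from anyone in this thread; the point is only that the same 2% reads very differently depending on the unit.

```python
def errors_per_page(error_rate, units_per_page):
    """Expected number of errors on one page for a given unit of measurement."""
    return error_rate * units_per_page

# Hypothetical page sizes, chosen only for illustration.
CHARS_PER_PAGE = 1800
WORDS_PER_PAGE = 300
LINES_PER_PAGE = 30

for label, units in [
    ("characters", CHARS_PER_PAGE),
    ("words", WORDS_PER_PAGE),
    ("lines", LINES_PER_PAGE),
]:
    rate = errors_per_page(0.02, units)
    print("2% of {}: about {:.1f} errors per page".format(label, rate))
```

Under those assumptions, a 2% per-character rate is dozens of fixes per page, while 2% per line is less than one.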
|
08-23-2011, 06:17 PM | #6 | |
Addict
Posts: 351
Karma: 70000
Join Date: Jul 2010
Location: Australia
Device: ADE, iPad
|
Quote:
You do have to look for words with hard hyphens like- this and paragraphs which start with lowercase letters, but a GREP search in ID will get almost all of them, and it doesn't take long to go through them one by one. Markzware makes a Quark-to-ID converter, which is a must; I use it to convert Quark documents to ID. http://markzware.com/products/q2id/ |
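For anyone doing this outside InDesign, the two checks described above (hard hyphens, and paragraphs that start in lowercase) can be roughly sketched in Python. The patterns and function names below are illustrative assumptions, not the actual GREP queries used in ID, and they only flag candidates for manual review.

```python
import re

# Illustrative patterns (assumptions, not the poster's GREP queries).
# A word, a hyphen, whitespace, then a lowercase continuation: "like- this".
HARD_HYPHEN = re.compile(r"(\w+)-\s+([a-z]\w*)")
# A paragraph whose first character is a lowercase letter.
LOWERCASE_START = re.compile(r"^[a-z]")

def find_hard_hyphens(text):
    """Return (left, right) pairs for words split by a hyphen plus whitespace."""
    return HARD_HYPHEN.findall(text)

def find_lowercase_paragraphs(paragraphs):
    """Return the indexes of paragraphs that begin with a lowercase letter."""
    return [
        i for i, p in enumerate(paragraphs)
        if LOWERCASE_START.match(p.lstrip())
    ]

sample = "You have to look for words with hard hyphens like- this one."
print(find_hard_hyphens(sample))
print(find_lowercase_paragraphs(["A new paragraph.", "continued from the last page."]))
```

Going through the hits one by one is safer than a blind replace, since some hyphens followed by a space are intentional.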
|
08-26-2011, 01:39 AM | #7 |
Media Bloke
Posts: 2,381
Karma: 113956855
Join Date: Sep 2010
Location: NSW - Australia
Device: iOS
|
I just double-click Quark files and they open in InDesign with no plugin.
|
08-30-2011, 06:15 AM | #8 |
Hedge Wizard
Posts: 800
Karma: 19999999
Join Date: May 2011
Location: UK/Philippines
Device: Kobo Touch, Nook Simple
|
I have had reasonable results using Nitro PDF Viewer (freeware) to export the text in the PDF to a text file. I then converted the file to HTML and used Calibre to convert the HTML to epub.
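The text-to-HTML step in the middle can be done in many ways; the post does not say which tool was used. As one possible sketch, assuming the exported text separates paragraphs with blank lines, a few lines of Python will wrap each paragraph in `<p>` tags. The function name and the sample strings are hypothetical.

```python
import html
import re

def text_to_html(text, title="Untitled"):
    """Wrap blank-line-separated paragraphs in <p> tags inside a minimal HTML page."""
    # Split on one or more blank lines and drop empty chunks.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    # Join hard-wrapped lines within a paragraph and escape &, <, >.
    body = "\n".join(
        "<p>{}</p>".format(html.escape(" ".join(p.split())))
        for p in paragraphs
    )
    return "<html><head><title>{}</title></head><body>\n{}\n</body></html>".format(
        html.escape(title), body
    )

print(text_to_html("First paragraph,\nwrapped over two lines.\n\nSecond paragraph.", "My Book"))
```

The resulting file can then be added to Calibre, or converted from the command line with Calibre's `ebook-convert book.html book.epub`. Formatting such as italics is not recovered by this approach.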
|
08-30-2011, 06:56 AM | #9 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
Convert to a text file? What about the layout and formatting such as italics? You will lose those...
|
08-30-2011, 07:37 AM | #10 |
Addict
Posts: 351
Karma: 70000
Join Date: Jul 2010
Location: Australia
Device: ADE, iPad
|
Saving to a Word file from Acrobat retains most of the formatting.
|
08-30-2011, 07:48 AM | #11 | |
Hedge Wizard
Posts: 800
Karma: 19999999
Join Date: May 2011
Location: UK/Philippines
Device: Kobo Touch, Nook Simple
|
Quote:
With regard to italics, I cannot remember whether it detected them, as most of the items I converted contained little or no italic text. I only use this method when other methods do not produce a satisfactory result. Try it out and see what you think. I have found the free Nitro PDF reader to be very good, and I use it in preference to the Adobe offerings. |
|
08-30-2011, 08:30 AM | #12 |
Grand Sorcerer
Posts: 27,550
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
I crop with Adobe Acrobat, then export as HTML and proceed to regex the piss out of it. I've also had decent luck with PDFMasher, but then you need to add images and a lot of formatting back in (and it's still pretty experimental). I think it's always going to be a fairly hands-on affair to convert PDFs.
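As an example of the kind of regex cleanup meant here, the sketch below drops paragraphs that contain nothing but a number, which is how stray page numbers often end up in exported HTML. That the export uses plain `<p>` tags for them is an assumption; real output varies, so the pattern would need adjusting to the actual markup.

```python
import re

# Assumption: stray page numbers appear as paragraphs holding only digits.
PAGE_NUMBER_PARAGRAPH = re.compile(r"<p[^>]*>\s*\d+\s*</p>\s*", re.IGNORECASE)

def strip_page_numbers(markup):
    """Remove paragraphs whose entire content is a number."""
    return PAGE_NUMBER_PARAGRAPH.sub("", markup)

sample = "<p>End of one page.</p>\n<p>12</p>\n<p>Start of the next.</p>"
print(strip_page_numbers(sample))
```

A pattern like this also removes legitimate number-only paragraphs (a chapter titled with a year, say), so it is worth checking the matches before replacing them all.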
|
|