| 
			
			 | 
		#1 | 
| 
			
			
			
			 Zealot 
			
			Posts: 107 
				Karma: 1000 
				Join Date: Sep 2010 
				Location: Melbourne, Australia 
				
				
				Device: iPad2, Kindle 
				
				
				 | 
	
	
	
		
		
			
			 
				
				Convert PDF to epub?
			 
			
			
			I have a few books only available to me as PDFs (and Quark files - but I don't have Quark). 
		
	
		
		
		
		
		
		
		
		
		
		
	
	Any recommended tutorials for converting PDFs to epubs?  | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#2 | 
| 
			
			
			
			 Wizard 
			
			Posts: 4,520 
				Karma: 121692313 
				Join Date: Oct 2009 
				Location: Heemskerk, NL 
				
				
				Device: PRS-T1, Kobo Touch, Kobo Aura 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			Converting PDF to ePUB is cumbersome, to say the least. Your results depend on the PDF, of course. Sometimes Calibre gives reasonable results, as do some other tools. No tool gives a perfect conversion. 
		
	
		
		
		
		
		
		
		
		
		
		
	
	You could try running the PDF through ABBYY, but that would also mean manual cleanup. Common issues: wrong OCR; paragraphs in the wrong place or not there at all; totally wrong layout; missing text; headers and footers (including page numbers) scattered throughout the text.  | 
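For the last issue in that list, running headers, footers and page numbers, a rough Python sketch of one possible cleanup follows. It assumes the text has already been extracted one string per page; the function name `strip_running_lines` and the `min_fraction` cutoff are made up for illustration and are not part of any tool mentioned in this thread.

```python
import re
from collections import Counter


def strip_running_lines(pages, min_fraction=0.5):
    """Drop lines that repeat on many pages (headers, footers, page numbers).

    pages: list of strings, one per extracted page (an assumption).
    min_fraction: hypothetical cutoff; a line seen on at least this share
    of the pages is treated as a running header or footer.
    """
    def key(line):
        # Replace digits so "Page 12" and "Page 13" count as the same line.
        return re.sub(r"\d+", "#", line.strip())

    counts = Counter()
    for page in pages:
        # Count each distinct non-blank line once per page.
        counts.update({key(line) for line in page.splitlines() if line.strip()})

    threshold = max(2, int(len(pages) * min_fraction))
    cleaned = []
    for page in pages:
        # Blank lines are never counted, so they are always kept.
        kept = [line for line in page.splitlines() if counts[key(line)] < threshold]
        cleaned.append("\n".join(kept))
    return cleaned
```

A short line of body text that genuinely repeats on most pages would also be dropped, so the output still needs a read-through.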
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#3 | 
| 
			
			
			
			 Color me gone 
			
			Posts: 2,089 
				Karma: 1445295 
				Join Date: Apr 2008 
				Location: Central Oregon Coast 
				
				
				Device: PRS-300 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			You can add duplicated caption text to the list of possible errors. 
		
	
		
		
		
		
		
		
		
		
		
		
	
	Some people use Mobipocket Creator and feed its HTML output into Calibre or Sigil. Whatever you do, plan on some work. BTW, these only work if the PDFs contain actual text and are not just containers for scans of the pages. PDFs containing only images will have to be OCRed, and the result, often a mess, cleaned up. Then you get to appreciate that a 2% error rate means errors on every page, multiplied by the number of pages to correct.  | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#4 | 
| 
			
			
			
			 Wizard 
			
			Posts: 4,520 
				Karma: 121692313 
				Join Date: Oct 2009 
				Location: Heemskerk, NL 
				
				
				Device: PRS-T1, Kobo Touch, Kobo Aura 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			Hmm, I get an error rate of less than 2% with OCR, depending on the source. Sometimes even closer to 0.2%.
		 
		
	
		
		
		
		
		
		
		
		
		
		
	
	 | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#5 | 
| 
			
			
			
			 frumious Bandersnatch 
			
			Posts: 7,570 
				Karma: 20150435 
				Join Date: Jan 2008 
				Location: Spaniard in Sweden 
				
				
				Device: Cybook Orizon, Kobo Aura 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			An error of 2% means 1 error every 50... what? Every 50 characters? That's unacceptable. Every 50 pages? That's too good to be true. Every 50 lines? That could be.
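To put rough numbers on that question, here is a small Python sketch. The page size (1,800 characters, 300 words, 30 lines) is an assumed figure for a typical paperback page, not something measured from any book in this thread.

```python
def errors_per_page(error_rate, units_per_page):
    """Expected number of errors on one page for a given error rate."""
    return error_rate * units_per_page


# Hypothetical page size: these three counts are assumptions.
page = {"characters": 1800, "words": 300, "lines": 30}

for unit, count in page.items():
    print(f"2% of {unit}: {errors_per_page(0.02, count):.1f} errors per page")
```

Under those assumptions, 2% of characters is dozens of errors per page, 2% of words is a handful, and 2% of lines is less than one per page, so the unit matters a great deal.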
		 
		
	
		
		
		
		
		
		
		
		
		
		
	
	 | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#6 | |
| 
			
			
			
			 Addict 
			
			Posts: 351 
				Karma: 70000 
				Join Date: Jul 2010 
				Location: Australia 
				
				
				Device: ADE, iPad 
				
				
				 | 
	
	
	
		
		
		
		
		 Quote: 
	
 You do have to look for words with hard hyphens like- this and paragraphs which start with lowercase letters, but a GREP search in InDesign will catch almost all of them, and it doesn't take long to go through them one by one. Markzware make a Quark to InDesign converter which is a must; I use it to convert Quark documents to InDesign. http://markzware.com/products/q2id/  | 
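For anyone without InDesign, the same two checks can be sketched in Python. This is not the GREP search described above, just an approximation of it; the patterns and the name `find_suspects` are assumptions, and the input is taken to be a list of paragraph strings.

```python
import re

# A hyphen followed by whitespace and a lowercase letter, e.g. "like- this".
HARD_HYPHEN = re.compile(r"\w-\s+[a-z]")
# A paragraph starting with a lowercase letter probably continues the previous one.
LOWERCASE_START = re.compile(r"^[a-z]")


def find_suspects(paragraphs):
    """Return (index, reason, text) for each paragraph worth checking by hand."""
    suspects = []
    for i, para in enumerate(paragraphs):
        text = para.strip()
        if HARD_HYPHEN.search(text):
            suspects.append((i, "hard hyphen", text))
        if LOWERCASE_START.match(text):
            suspects.append((i, "lowercase start", text))
    return suspects
```

It only reports candidates rather than fixing them, since a phrase like "pre- and post-war" is a legitimate match that should be left alone.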
|
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#7 | 
| 
			
			
			
			 Media Bloke 
			
			Posts: 2,382 
				Karma: 113956855 
				Join Date: Oct 2010 
				Location: NSW - Australia 
				
				
				Device: iOS 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			I just double-click Quark files and they open in InDesign with no plugin.
		 
		
	
		
		
		
		
		
		
		
		
		
		
	
	 | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#8 | 
| 
			
			
			
			 Hedge Wizard 
			
			Posts: 802 
				Karma: 19999999 
				Join Date: May 2011 
				Location: UK/Philippines 
				
				
				Device: Kobo Touch, Nook Simple 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			I have had reasonable results using Nitro PDF Viewer (freeware) to export the text in the PDF to a text file. I then converted the file to HTML and used Calibre to convert the HTML to ePUB.
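The post does not say how the text file was turned into HTML, so the following Python sketch is only one possible way to do that middle step. It assumes paragraphs in the exported text are separated by blank lines; the function name `text_to_html` is made up for illustration.

```python
import html
import re


def text_to_html(text, title="Untitled"):
    """Wrap blank-line-separated paragraphs of plain text in a minimal HTML page."""
    # Split on blank lines; single newlines inside a block are treated as soft wraps.
    blocks = re.split(r"\n\s*\n", text.strip())
    paragraphs = []
    for block in blocks:
        joined = " ".join(line.strip() for line in block.splitlines())
        # Escape &, < and > so stray characters cannot break the markup.
        paragraphs.append(f"<p>{html.escape(joined)}</p>")
    body = "\n".join(paragraphs)
    return (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="utf-8"><title>{html.escape(title)}</title></head>\n'
        f"<body>\n{body}\n</body></html>\n"
    )
```

The resulting file could then be added to Calibre, or passed to Calibre's `ebook-convert` command-line tool as the input file with an `.epub` output name. Italics and other formatting are already gone at the text-file stage, so this step cannot bring them back.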
		 
		
	
		
		
		
		
		
		
		
		
		
		
	
	 | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#9 | 
| 
			
			
			
			 Wizard 
			
			Posts: 4,520 
				Karma: 121692313 
				Join Date: Oct 2009 
				Location: Heemskerk, NL 
				
				
				Device: PRS-T1, Kobo Touch, Kobo Aura 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			Convert to a text file? How about the layout and characteristics like italic? You will lose those...
		 
		
	
		
		
		
		
		
		
		
		
		
		
	
	 | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#10 | 
| 
			
			
			
			 Addict 
			
			Posts: 351 
				Karma: 70000 
				Join Date: Jul 2010 
				Location: Australia 
				
				
				Device: ADE, iPad 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			Saving to a Word file from Acrobat retains most of the formatting.
		 
		
	
		
		
		
		
		
		
		
		
		
		
	
	 | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#11 | |
| 
			
			
			
			 Hedge Wizard 
			
			Posts: 802 
				Karma: 19999999 
				Join Date: May 2011 
				Location: UK/Philippines 
				
				
				Device: Kobo Touch, Nook Simple 
				
				
				 | 
	
	
	
		
		
		
		
		 Quote: 
	
 With regard to italic, I cannot remember whether it was detected, as most of the items I converted contained few or no italic characters. I only use this method when other methods do not produce a satisfactory result. Try it out and see what you think. I have found the free Nitro PDF reader to be very good, and I use it in preference to the Adobe offerings.  | 
|
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#12 | 
| 
			
			
			
			 Grand Sorcerer 
			
			Posts: 28,892 
				Karma: 207182180 
				Join Date: Jan 2010 
				
				
				
				Device: Nexus 7, Kindle Fire HD 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			I crop with Adobe Acrobat, then export as HTML and proceed to regex the piss out of it. I've also had decent luck with PDFMasher, but then you need to add images and a lot of formatting back in (and it's still pretty experimental). I think it's always going to be a fairly hands-on affair to convert PDFs.
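As a rough idea of what that regex pass might look like, here is a Python sketch. The markup it targets (inline `style` and `class` attributes, `<span>` wrappers, `&nbsp;` runs, empty paragraphs) is an assumption about typical exported HTML, not a description of what Acrobat actually writes, and the name `tidy_exported_html` is made up.

```python
import re


def tidy_exported_html(markup):
    """Strip common layout clutter from exported HTML (assumed patterns only)."""
    # Drop inline style and class attributes, which mostly carry page layout.
    markup = re.sub(r'\s+(?:style|class)="[^"]*"', "", markup)
    # Unwrap the now attribute-free <span> tags, keeping their text.
    # Caution: if italics were carried by span styles, they are lost here.
    markup = re.sub(r"</?span>", "", markup)
    # Turn non-breaking spaces into plain spaces and collapse runs of spaces.
    markup = markup.replace("&nbsp;", " ")
    markup = re.sub(r"[ \t]{2,}", " ", markup)
    # Remove paragraphs that contain nothing but whitespace.
    markup = re.sub(r"<p>\s*</p>\n?", "", markup)
    return markup
```

Each substitution is deliberately narrow, so it is easy to drop or adjust one after looking at what a particular export really contains.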
		 
		
	
		
		
		
		
		
		
		
		
		
		
	
	 | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 