|  12-24-2016, 10:00 PM | #1 | 
| Guru            Posts: 942 Karma: 53902736 Join Date: Jun 2015 Device: multiple | 
				
				Good way to convert pdfs to epubs on the mac?
			 
			
			Most of these are scanned pdfs. Some pdfs can freeze Preview, Skim, or either e-reader. And generally pdfs are much harder on the Kindle Dx than mobis are. Some of these have text layers, some don't. Extracting the text layers, removing the line-breaking hyphens, and resolving the misspellings could help. | 
|   |   | 
|  12-24-2016, 11:04 PM | #2 | 
| Enthusiast            Posts: 33 Karma: 26718 Join Date: Nov 2013 Location: Long Island, NY - USA Device: Oasis | 
			
			Off the top of my head, I'd use https://smallpdf.com/ to convert the PDF to Word and then let Calibre do the conversion to whatever eBook format you like. The only consistent problem I've encountered is that the conversion adds about one extra space between words every two pages or so (which is easily fixable automagically in a number of ways).
		 | 
|   |   | 
|  12-25-2016, 11:19 AM | #3 | 
| Guru            Posts: 942 Karma: 53902736 Join Date: Jun 2015 Device: multiple | 
			
			I tried https://smallpdf.com/ with 3 pdfs. The first 2 times it answered "This file does not seem to be a PDF." The last time it answered "Sorry your upload failed. Please try again," twice. P.S. I tried the help page. It flashed, and now I have a migraine. And no, I can't use the web without strobe-blocking and animation-blocking extensions. Last edited by MarjaE; 12-25-2016 at 11:28 AM. | 
|   |   | 
|  12-25-2016, 03:03 PM | #4 | |
| Enthusiast            Posts: 48 Karma: 854254 Join Date: Nov 2016 Device: none | Quote: 
 I don't think you'll find a magic wand that'll guess your intended goal. What you should be looking for is a workflow, a combination of tools. Since you know your material and where you want to end up, after trying several things and converting and adjusting a few of the pdfs you'll realize which tools are good for your needs. There's no shortage of tools to edit and manipulate pdfs, so I'll give you a few off the top of my head: LibreOffice, imagemagick, gimp, and different pdf readers and their converting options. What you have to do is google 'pdf' + "program" + "approximate goal" and you'll get a lot of good results. At the beginning it might sound like an inconvenient thing to do, but after you see the results you'll get the hang of it. Lastly, Safari's PDF viewer has pretty darn good OCR when highlighting pdfs. peace | |
|   |   | 
|  12-25-2016, 05:12 PM | #5 | 
| Guru            Posts: 942 Karma: 53902736 Join Date: Jun 2015 Device: multiple | 
			
			Okay. Some of these have their own imperfect text layers. Splitting or compressing the documents often results in losing the text layers. (I use pdf toolkit+) Some of these come from the Internet Archive and have ocr'd text versions. The big problems are that the ocr can screw up tables, can misread figures, and of course, can misread ordinary words. So I've needed either pdf or djvu for comparison. Some don't have text versions. If I can extract the text layer, then spell-checkers could help with the minor errors, the substitution of punctuation for letters, etc., in English-language docs. Not so useful with the major errors. (I would prefer NeoOffice to LibreOffice for this, but neither can find and replace hyphen-breaks or extra line breaks, so I'd probably need Calibre's editing tools too.) If I can find, excerpt, and re-compress the relevant tables, I could perhaps use two versions, one a pdf with the tables, and the other an epub or mobi with the text. (I would keep using pdf toolkit+) | 
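One way to handle the hyphen-breaks and extra line breaks mentioned above, once the text layer is in a plain text file, is a few regular expressions. This is only a minimal sketch in Python, not something anyone in the thread used; the function name `join_broken_lines` is made up for illustration, and it assumes paragraphs are separated by blank lines.

```python
import re

def join_broken_lines(text: str) -> str:
    """Undo line-break hyphenation and hard wraps in extracted text."""
    # "conver-\nsion" -> "conversion": a hyphen at the end of a line,
    # followed by a lowercase letter on the next line.
    text = re.sub(r"(\w)-\n(?=[a-z])", r"\1", text)
    # A single newline inside a paragraph becomes a space;
    # blank lines (paragraph breaks) are left alone.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Collapse runs of spaces or tabs into one space.
    text = re.sub(r"[ \t]{2,}", " ", text)
    return text

sample = "The conver-\nsion adds  extra\nspaces.\n\nNext para."
print(join_broken_lines(sample))
```

Be aware that a real compound split at the line end ("well-" / "known") would be merged into "wellknown", so the result still needs a spell-check pass.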
|   |   | 
|  12-25-2016, 06:46 PM | #6 | 
| Fuzzball, the purple cat            Posts: 1,312 Karma: 11087488 Join Date: Jun 2011 Location: California Device: iPad | |
|   |   | 
|  12-26-2016, 12:36 PM | #7 | 
| Guru            Posts: 942 Karma: 53902736 Join Date: Jun 2015 Device: multiple | 
			
			Okay, thanks. I really need some way to extract existing text layers for entire books. I know "it isn't hard" but I don't know how to do it. I would also like some way to speed up proofreading, and extract the pages with tables and insert them at the appropriate points in the text. I don't have Word, but I imported into NeoOffice and got a long ocr'd drawing of one text. I think the ocr was the Internet Archive's text layer, but I don't know for sure. I can see that some figures are off - 4 for 1, 0 for 6, etc. I don't know how to strip off the source images or convert all the small text boxes into proper tables... And with my disabilities, I haven't found an accessible tablet, and I never expect to. | 
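For speeding up proofreading, one possible approach is to list the tokens most likely to be OCR misreads and check only those against the scan. The sketch below is an assumption about what might help, not a tool from the thread: it flags words that mix letters and digits (the kind of thing a spell-checker may skip), and `suspicious_tokens` is a hypothetical name.

```python
import re

# Words containing both a letter and a digit, e.g. "c0mpany" or "1ist",
# are often OCR misreads of ordinary words.
MIXED = re.compile(r"\b(?=\w*[A-Za-z])(?=\w*\d)\w+\b")

def suspicious_tokens(text):
    """Return (line_number, token) pairs worth checking against the scan."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in MIXED.finditer(line):
            hits.append((lineno, match.group()))
    return hits

for lineno, token in suspicious_tokens("The c0mpany sold 416 units.\nA 1ist of items"):
    print(lineno, token)
```

This cannot catch a figure misread as another figure (4 for 1, 0 for 6), since both are valid numbers; those still have to be compared with the page image. It will also flag legitimate tokens such as "2nd".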
|   |   | 
|  12-27-2016, 12:38 PM | #8 | 
| Grand Sorcerer            Posts: 11,470 Karma: 13095790 Join Date: Aug 2007 Location: Grass Valley, CA Device: EB 1150, EZ Reader, Literati, iPad 2 & Air 2, iPhone 7 | 
			
			To extract the text use Adobe reader and save as text from the file menu.
		 | 
|   |   | 
|  12-28-2016, 11:23 AM | #9 | 
| Fuzzball, the purple cat            Posts: 1,312 Karma: 11087488 Join Date: Jun 2011 Location: California Device: iPad | |
|   |   | 
|  12-28-2016, 06:13 PM | #10 | 
| Guru            Posts: 942 Karma: 53902736 Join Date: Jun 2015 Device: multiple | 
			
			I don't have Adobe Reader. I have a nasty strobe sensitivity. Adobe's site has hit me with strobes. I use a number of Firefox accessibility fixes, but they didn't block these strobes. I have Adobe Digital Editions from years ago, but I avoid that site now. Neither Skim nor Preview allow me to save pdfs as text. I never figured out how to install k2pdf. Or exactly what I could do with it. I have Elucidate to create a text layer. | 
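A route that avoids Adobe Reader entirely is the command-line tool `pdftotext`, which ships with Poppler (and Xpdf). It was not mentioned in the thread, so treat this as a suggestion under assumptions: that Poppler is installed (for example through Homebrew), and that the pdf already has a text layer. The Python wrapper below is a minimal sketch with hypothetical function names; it only builds and runs the command.

```python
import shutil
import subprocess
from pathlib import Path

def pdftotext_command(pdf, keep_layout=True):
    """Build the pdftotext call that writes book.txt next to book.pdf."""
    pdf = Path(pdf)
    cmd = ["pdftotext"]
    if keep_layout:
        # -layout tries to preserve the physical layout, which can help with tables.
        cmd.append("-layout")
    cmd += [str(pdf), str(pdf.with_suffix(".txt"))]
    return cmd

def extract_text_layer(pdf):
    """Run pdftotext on one pdf and return the path of the text file."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found; it is part of Poppler/Xpdf")
    cmd = pdftotext_command(pdf)
    subprocess.run(cmd, check=True)
    return Path(cmd[-1])

print(pdftotext_command("book.pdf"))
```

A scanned pdf without a text layer will produce an empty or near-empty file; those need OCR first.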
|   |   | 
|  12-30-2016, 12:11 AM | #11 | 
| Fuzzball, the purple cat            Posts: 1,312 Karma: 11087488 Join Date: Jun 2011 Location: California Device: iPad | |
|   |   | 
|  01-07-2017, 04:35 AM | #12 | 
| Junior Member  Posts: 9 Karma: 10 Join Date: Dec 2016 Device: Kobo Aura One | 
			
			I only have a history of two weeks trying to convert: I tried calibre, Acrobat Pro with Calibre, Wondershare PDF converter and Aiseesoft Converter. The only software I found to do this reliably was the Aiseesoft PDF Converter, and I stopped looking for another solution, as this simply did the trick for what I needed. There is a pdf-to-epub-only version, which is cheaper. In theory you should be able to set the auto-import folder of calibre as the output folder of Aiseesoft, but unfortunately it doesn't set author and title right, so I export into the source folder and then add the format to calibre manually. It's a bit inconvenient, but the result is worth it. Or you can auto-add it, mark the old and the new book, and then use command-shift-M to merge the two books into the first selected one; this requires the least input. Therefore this isn't a solution for bulk conversion, but I only use it when calibre doesn't yield a decent result, and then it is brilliant. I tried to achieve the same with Acrobat Pro, but it was inferior by far. I reconverted books that came out unusable with everything else I tried and it worked fine. https://www.aiseesoft.de/pdf-to-epub-converter/ PS: For the cover images to show up in iBooks I had to do an epub to epub conversion with the tablet output profile in calibre, but that might not be a concern for you. Hope this helps, Rob. Last edited by helixpteron; 01-07-2017 at 04:47 AM. | 
|   |   | 
|  01-08-2017, 11:42 AM | #13 | |
| Fuzzball, the purple cat            Posts: 1,312 Karma: 11087488 Join Date: Jun 2011 Location: California Device: iPad | Quote: 
 | |
|   |   | 
|  01-13-2017, 11:52 AM | #14 | 
| Enthusiast            Posts: 48 Karma: 854254 Join Date: Nov 2016 Device: none | 
			
			Hello, if the pdf is text based, "pdftohtml" gets it right, but with a css/html monstrosity which can be taken care of afterwards. There's a more sophisticated tool which I haven't tried, 'pdf2htmlEX'. This one renders complex scientific pdfs into proper mathml and some other nifty proper html/css/whatever format. If it's just text, 'pdftohtml' gets font size, bold, italics, quotes, etc. correct. As I said earlier, it's about trying to see what fits one's particular case. Here calibre fared poorly with a direct pdf to epub conversion. Sigil, as a tool, is quite a bit faster for post-pdf editing/clean-up. | 
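The pdftohtml route can be chained with calibre's command-line converter, `ebook-convert`. The following is only a sketch of how that might look, not the poster's actual workflow: it assumes both tools are on the PATH, that `pdftohtml -noframes` writes a single `book.html` next to `book.pdf`, and the function name `pdf_to_epub_commands` is made up.

```python
from pathlib import Path

def pdf_to_epub_commands(pdf):
    """Return the two commands: pdf -> single html file -> epub."""
    pdf = Path(pdf)
    html = pdf.with_suffix(".html")  # assumed output name of pdftohtml
    epub = pdf.with_suffix(".epub")
    return [
        # -noframes asks for one html file instead of a frameset.
        ["pdftohtml", "-noframes", str(pdf)],
        # calibre picks the formats from the file extensions.
        ["ebook-convert", str(html), str(epub)],
    ]

for cmd in pdf_to_epub_commands("book.pdf"):
    print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)`. Opening the html in Sigil or calibre's editor between the two steps is where the css clean-up would happen.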
|   |   | 
|  01-19-2017, 07:21 AM | #15 | 
| Junior Member  Posts: 9 Karma: 10 Join Date: Dec 2016 Device: Kobo Aura One | 
			
			I must admit that I only tried it with PDFs that were in part complex, but never with a multi-column one. Thanks for the hint about Word - I will now use Aiseesoft for books and Word for papers. Cheers, Rob. | 
|   |   | 