Old 02-11-2017, 12:14 AM   #16
MarjaE
Guru
Posts: 924
Karma: 53902736
Join Date: Jun 2015
Device: multiple
Thanks.

I had been struggling with a Google Books PDF - I couldn't find a readable non-Google Books PDF of the same work - but Aiseesoft couldn't handle all the tables. I guess I'm still going to need to use Elucidate whenever I have to use Google Books.
Old 02-22-2017, 07:08 AM   #17
helixpteron
Junior Member
Posts: 9
Karma: 10
Join Date: Dec 2016
Device: Kobo Aura One
Quote:
Originally Posted by MarjaE View Post
Elucidate
...

I haven't found much on the net about it - is it e-learning software? Thanks.
Old 02-23-2017, 08:53 AM   #18
willus
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
Quote:
Originally Posted by helixpteron View Post
...

I haven't found much on the net about [Elucidate] - is it e-learning software? Thanks.
Elucidate is a Mac app that provides optical character recognition (OCR) for scanned PDFs using the Tesseract OCR engine.

https://itunes.apple.com/us/app/elucidate/id1066088407

Last edited by willus; 02-23-2017 at 08:56 AM.
Old 04-02-2017, 02:00 AM   #19
MarjaE
Guru
... Another few tries.

* If you have a good text layer and want to extract it, most macOS Sierra applications won't handle ligatures: they substitute blank spaces for the ff and fi ligatures, and probably others. I understand they may not handle superscripts either.

* If you don't have a good text layer, you will need OCR to create one before you can extract it. Tesseract (e.g. via Elucidate) is good for short passages, but do you want to correct errors across an entire OCRed book? Abbyy FineReader might work better.

* Sometimes OCR merges columns in a 2-column or 3-column layout. Sometimes OCR separates columns in tables. The more it avoids one error, the more likely it is to run into the other. Processing before OCR makes text recognition errors more likely, but processing with k2pdfopt might make column recognition errors less likely. I haven't tested this fix.

* If the original format isn't important, and if the ligature bug gets fixed, then extracting the text and manually re-inserting pictures and tables may be a workable fix. I haven't gotten this working yet, though.

* In the case of Internet Archive texts, there are usually EPUB and/or txt versions as well as the PDF version.
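
On the ligature point above, here is a minimal Python sketch of a related cleanup step. It assumes the extracted text still contains the Unicode ligature characters (such as U+FB01 for "fi"); it cannot recover ligatures that an application has already replaced with blank spaces. The names `LIGATURES`, `expand_ligatures` and `count_ligatures` are made up for this illustration.

```python
# Map the common Latin ligature code points to their plain-letter spellings.
# Escapes are used so the source survives copy/paste through tools that
# mangle ligature characters.
LIGATURES = {
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
    "\ufb03": "ffi",
    "\ufb04": "ffl",
    "\ufb06": "st",
}

# str.translate needs a table keyed by code point; maketrans builds it.
_TABLE = str.maketrans(LIGATURES)


def expand_ligatures(text: str) -> str:
    """Return text with each known ligature replaced by plain letters."""
    return text.translate(_TABLE)


def count_ligatures(text: str) -> int:
    """Count known ligature characters, e.g. to check a sample page."""
    return sum(text.count(ch) for ch in LIGATURES)


if __name__ == "__main__":
    sample = "o\ufb03ce \ufb01le"
    print(count_ligatures(sample))
    print(expand_ligatures(sample))
```

Running `count_ligatures` on a few extracted pages is a quick way to see whether a given extraction tool kept the ligature characters or dropped them.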
Old 04-02-2017, 03:04 PM   #20
MarjaE
Guru
* Fix for the ligature issue: I tried to slice off a couple of pages with a lot of ligatures as a sample file. ... It fixed the ligatures there. I wouldn't want to overwrite the originals of a lot of my PDFs, but I can keep an extra copy which has been split. Or I can extract the text from that copy.

* Sometimes source PDFs have out-of-order text layers. I have no idea how to fix these.
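
One possible approach to out-of-order text layers, untested on real files, is to re-sort the text blocks by their position on the page. The Python sketch below assumes some PDF library has already given you each block as `(x0, y0, x1, y1, text)` with the origin at the top-left of the page, and that the columns are of equal width. The `reading_order` function and its parameters are hypothetical, not part of any tool mentioned in this thread.

```python
from typing import List, Tuple

# One text block: left, top, right, bottom, text (origin at top-left).
Block = Tuple[float, float, float, float, str]


def reading_order(blocks: List[Block], page_width: float,
                  columns: int = 1) -> List[Block]:
    """Sort blocks column by column, then top to bottom, then left to right.

    Assumes equal-width columns; a block is assigned to the column that
    contains its horizontal centre.
    """
    col_width = page_width / columns

    def key(block: Block):
        x0, y0, x1, _y1, _text = block
        centre = (x0 + x1) / 2
        # Clamp so blocks that overhang the page edge still get a valid column.
        col = max(0, min(int(centre // col_width), columns - 1))
        return (col, y0, x0)

    return sorted(blocks, key=key)


if __name__ == "__main__":
    page = [
        (300, 50, 500, 70, "C"),
        (50, 100, 250, 120, "B"),
        (50, 50, 250, 70, "A"),
        (300, 100, 500, 120, "D"),
    ]
    ordered = reading_order(page, page_width=550, columns=2)
    print("".join(b[4] for b in ordered))
```

This only reorders what the text layer already contains; if the layer itself is wrong or missing, re-running OCR is still the fallback.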

Last edited by MarjaE; 04-02-2017 at 04:08 PM.
Old 04-11-2017, 12:16 PM   #21
HarryT
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
Quote:
Originally Posted by MarjaE View Post
Sometimes source pdfs have out-of-order text layers. I have no idea how to fix these.
The most reliable way to convert PDF to reflowable formats is (in my experience, at least) to use a decent OCR program like Abbyy FineReader. You can pick up older versions of this for a moderate price.
Old 05-19-2017, 05:51 AM   #22
Janet16
Enthusiast
Posts: 41
Karma: 2621116
Join Date: Jul 2011
Device: iPad
Yes, Abbyy FineReader is a decent OCR program, but it's a little expensive.
I have also tried Prizmo and Cisdem PDFConverterOCR. I found that if you only want to convert scanned PDFs into EPUB (or another editable format), PDFConverterOCR is a good choice; it is available as a StackSocial deal.
Prizmo does a good job of extracting text from a PDF file and editing it.
Old 05-20-2017, 06:42 AM   #23
HarryT
eBook Enthusiast
Quote:
Originally Posted by Janet16 View Post
Yes, Abbyy FineReader is a decent OCR program, but it's a little expensive.
The current version is extremely expensive, but you can buy older versions much more cheaply, and you really don't need the latest version for this task.