Old 05-25-2017, 04:18 PM   #1
Memes
Junior Member
Posts: 3
Karma: 31958
Join Date: Apr 2017
Device: Kindle PW3
Convert Image PDF to PDF with text or other ebook format.

Hello,

I have been lurking on these forums for a few months now, and I'm proud to say I have finally joined the ranks of those who read on an e-ink display!

It's great I'm not wasting my phone battery or hunched over my laptop anymore.

However, most of the books I want to read are about the game of Go.

What I want to do is take my image-only PDFs and convert them to PDFs with text (so that instead of just an image on each page, it's text and an image).

This will make the text available for me to highlight and annotate (I own a Kindle Paperwhite) and will make the files easier to format using k2pdfopt. (When I try to reflow a PDF, the text below a diagram becomes unreadable because it's so small.)

Here is an example of what I'm working with:

http://imgur.com/a/18fPm

Thank you very much for your time, and I'm sorry if this question has already been answered.

P.S. I don't want to extract only the text; I want to keep the whole document.

Last edited by Memes; 05-25-2017 at 04:20 PM.
Old 05-26-2017, 09:51 PM   #2
willus
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
Quote:
Originally Posted by Memes View Post
What I want to do is take my PDFs and convert them to PDFs with text. (instead of an image on each page its text and an image)
You can do this with k2pdfopt if you also install the Tesseract OCR portion of it:
Code:
k2pdfopt -mode copy -n- -ocr t -odpi 100 -as go.pdf
See the attachments for an example. The -as option applies auto-straightening, since your image is slightly skewed.
Attached Files
File Type: pdf go.pdf (244.7 KB, 454 views)
File Type: pdf go_k2opt.pdf (134.6 KB, 447 views)
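For anyone who would rather script this than type it, here is a minimal Python sketch of the same call. It assumes `k2pdfopt` is on the PATH, and it only guesses the output name from the attachments above (`go.pdf` became `go_k2opt.pdf`). The function names are made up for illustration and are not part of k2pdfopt.

```python
import shutil
import subprocess
from pathlib import Path

# Options copied verbatim from the command in this post.
K2_OPTIONS = ["-mode", "copy", "-n-", "-ocr", "t", "-odpi", "100", "-as"]


def guessed_output_path(pdf: Path) -> Path:
    """Guess the output name, assuming the "_k2opt" suffix seen in the attachment."""
    return pdf.with_name(pdf.stem + "_k2opt" + pdf.suffix)


def ocr_copy(pdf: Path, exe: str = "k2pdfopt") -> Path:
    """Run k2pdfopt on one PDF and return the guessed output path.

    The child process inherits the terminal, so any prompt that
    k2pdfopt shows still reaches the user.
    """
    if shutil.which(exe) is None:
        raise FileNotFoundError(f"{exe} not found on PATH")
    subprocess.run([exe, *K2_OPTIONS, str(pdf)], check=True)
    return guessed_output_path(pdf)
```

Calling `ocr_copy(Path("go.pdf"))` should then leave `go_k2opt.pdf` next to the original, if the naming guess holds on your setup.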
Old 06-02-2017, 07:31 PM   #3
MarjaE
Guru
Posts: 924
Karma: 53902736
Join Date: Jun 2015
Device: multiple
I usually use Elucidate. It's a Mac front-end for Tesseract.

However, Tesseract sometimes drops the first letters of words, or reads things in the wrong order. I can't select an entry from a table of contents to check it in my translation software, because the selection jumps to the next page and inserts that text into the entry...
Old 06-09-2017, 03:27 PM   #4
Memes
Junior Member
Posts: 3
Karma: 31958
Join Date: Apr 2017
Device: Kindle PW3
Quote:
Originally Posted by willus View Post
You can do this with k2pdfopt if you also install the Tesseract OCR portion of it:
Code:
k2pdfopt -mode copy -n- -ocr t -odpi 100 -as go.pdf
See attachments as example. The -as applies auto-straighten since your image is slightly skewed.
Is the command you posted meant to be run from a Linux command line? I am currently running Windows.
Old 06-10-2017, 03:01 PM   #5
DaleDe
Grand Sorcerer
Posts: 11,470
Karma: 13095790
Join Date: Aug 2007
Location: Grass Valley, CA
Device: EB 1150, EZ Reader, Literati, iPad 2 & Air 2, iPhone 7
Quote:
Originally Posted by Memes View Post
Is the code you posted to be run from a command line in Linux? I am currently running Windows.
Open a CMD window and type the command in there. You could also put the command in a .bat file using a text editor and then double-click the .bat file.
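If you have several PDFs and don't want to type the .bat file by hand, a small Python sketch like the one below could write it for you. This is only an illustration: the helper names are hypothetical, the command is the one from post #2, and the quotes around the file name are an addition so that names with spaces survive.

```python
# The command from post #2, with the file name left as a placeholder.
# The double quotes are an addition to cope with spaces in file names.
COMMAND = 'k2pdfopt -mode copy -n- -ocr t -odpi 100 -as "{pdf}"'


def bat_text(pdf_names):
    """Return the text of a .bat file with one k2pdfopt line per PDF."""
    lines = ["@echo off"]
    lines += [COMMAND.format(pdf=name) for name in pdf_names]
    lines.append("pause")  # keep the window open so messages can be read
    return "\r\n".join(lines) + "\r\n"


def write_bat(bat_path, pdf_names):
    """Write the .bat file, keeping the Windows line endings as-is."""
    with open(bat_path, "w", newline="") as handle:
        handle.write(bat_text(pdf_names))
```

For example, `write_bat("ocr_go_books.bat", ["go.pdf"])` would produce a three-line file that can be double-clicked from the folder holding the PDFs.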

Dale
Old 06-10-2017, 11:47 PM   #6
willus
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
Or you can use the MS Windows GUI:

1. Under conversion mode, select "copy"
2. Check the autostraighten box
3. Check the OCR box
4. Uncheck the native output box
5. Set the output DPI as desired

If you do all this, the auto-generated command-line options should look mostly like what I put in my previous post.
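To make the link between the five GUI steps and the earlier command explicit, here is a rough Python sketch of the mapping. The class and function names are hypothetical, and it only covers the settings mentioned in this thread. What the GUI actually generates may differ (it should only "mostly" match), so treat this as an approximation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GuiSettings:
    """The five GUI choices from the steps above (names are illustrative)."""
    conversion_mode: str = "copy"   # step 1
    autostraighten: bool = True     # step 2
    ocr: bool = True                # step 3
    native_output: bool = False     # step 4 (box unchecked)
    output_dpi: int = 100           # step 5, "as desired"


def to_options(settings: GuiSettings) -> List[str]:
    """Translate the settings into the options used in the earlier command.

    An option is emitted only for the values shown in this thread;
    other values are simply left out rather than guessed.
    """
    opts = ["-mode", settings.conversion_mode]
    if not settings.native_output:
        opts.append("-n-")
    if settings.ocr:
        opts += ["-ocr", "t"]
    opts += ["-odpi", str(settings.output_dpi)]
    if settings.autostraighten:
        opts.append("-as")
    return opts
```

With the defaults, `" ".join(to_options(GuiSettings()))` gives the same option string as the command posted earlier in the thread.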
Old 06-16-2017, 03:29 AM   #7
Memes
Junior Member
Posts: 3
Karma: 31958
Join Date: Apr 2017
Device: Kindle PW3
Thank you, everyone, very much for your help and your time. I really appreciate it!
Old 05-01-2023, 04:52 PM   #8
unlocked2412
Junior Member
Posts: 1
Karma: 10
Join Date: May 2023
Device: Kindle Paperwhite 11
Thank you willus

Quote:
Originally Posted by willus View Post
You can do this with k2pdfopt if you also install the Tesseract OCR portion of it:
Code:
k2pdfopt -mode copy -n- -ocr t -odpi 100 -as go.pdf
See attachments as example. The -as applies auto-straighten since your image is slightly skewed.
Thank you @willus for this wonderful tool!!
All times are GMT -4.