Old 01-25-2018, 10:04 AM   #1516
MarjaE
Addict
Posts: 329
Karma: 1548692
Join Date: Jun 2015
Device: Iriver Story HD and Amazon Kindle DX
Screen shots of what?

I don't know what parts of the conversion attempts you want shots of. If I knew what you thought they might show, I might be able to figure that out.

If I try to run OCR from the stock (uncustomized) k2pdfopt, it can't find the environment variable TESSDATA_PREFIX, so it can't use Tesseract.
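
One way to confirm this before launching a conversion is to check the variable directly. This is only a minimal sketch, not part of k2pdfopt: check_tessdata is a hypothetical helper, and because it is not certain whether a given build expects TESSDATA_PREFIX to name the folder holding the .traineddata files or its parent, the helper looks in both places.

```shell
# Hypothetical helper (not part of k2pdfopt): report whether the Tesseract
# data file for one language is reachable through TESSDATA_PREFIX.
check_tessdata() {
    lang="$1"
    if [ -z "${TESSDATA_PREFIX:-}" ]; then
        echo "TESSDATA_PREFIX is not set"
        return 1
    fi
    # Assumption: the data may sit directly in the folder or in a
    # "tessdata" subfolder, so try both.
    for dir in "$TESSDATA_PREFIX" "$TESSDATA_PREFIX/tessdata"; do
        if [ -f "$dir/$lang.traineddata" ]; then
            echo "found $dir/$lang.traineddata"
            return 0
        fi
    done
    echo "no $lang.traineddata under $TESSDATA_PREFIX"
    return 1
}
```

Calling check_tessdata rus in the same shell that starts k2pdfopt shows whether the Russian data file is visible from there.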

If I run from my customized k2pdfopt_copy or k2pdfopt_dx, it can. k2pdfopt_copy uses -mode copy to avoid unnecessary compression; k2pdfopt_dx uses -mode copy -dev dx.
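
For readers who want to reproduce this setup, wrappers of this kind can be sketched as shell functions. This is an assumption about how such wrappers might look, not the actual scripts: TESSDATA_DIR and K2PDFOPT_BIN are illustrative names, and only -mode copy and -dev dx come from the description above.

```shell
# Hypothetical wrappers in the spirit of k2pdfopt_copy / k2pdfopt_dx.
# TESSDATA_DIR and K2PDFOPT_BIN are illustrative names, not k2pdfopt settings.
k2pdfopt_copy() {
    (
        # Export the variable only inside this subshell, so the caller's
        # environment is left untouched.
        TESSDATA_PREFIX="${TESSDATA_DIR:-$HOME/tessdata}"
        export TESSDATA_PREFIX
        "${K2PDFOPT_BIN:-k2pdfopt}" -mode copy "$@"
    )
}

k2pdfopt_dx() {
    # Same as above, with the DX device profile added.
    k2pdfopt_copy -dev dx "$@"
}
```

With these defined, k2pdfopt_copy -ocr -ocrlang rus book.pdf would run the stock binary with TESSDATA_PREFIX set for that one call (book.pdf is a placeholder).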

I have tried -ocr -ocrlang rus in k2pdfopt_copy. I may get a message stating:

Initializing OCR for 2 threads ..
Tesseract Open Source OCR Engine v3.05.00 [CUBE+] (lang=rus)
Reading 443 pages from

... but get no ocr at the end.

I am currently comparing my K2 results with my Elucidate results, but in the long run, I can't combine K2 with Elucidate.

Last edited by MarjaE; 01-25-2018 at 10:15 AM.
Old 01-27-2018, 12:47 PM   #1517
willus
Fuzzball, the purple cat
Posts: 967
Karma: 7562459
Join Date: Jun 2011
Location: California
Device: iPad
Quote:
Originally Posted by MarjaE View Post
Screen shots of what?
Something like the screenshot I've attached. In one graphic, I can tell which options you are running, which version you are using, whether OCR started correctly, what size source pages you are converting, how many OCR words were found, and how much CPU was used. It is very useful.

Quote:
Originally Posted by MarjaE View Post
I have tried -ocr -ocrlang rus in k2pdfopt_copy. I may get a message stating:

Initializing OCR for 2 threads ..
Tesseract Open Source OCR Engine v3.05.00 [CUBE+] (lang=rus)
Reading 443 pages from

... but get no ocr at the end...
Please tell me what it says at the end of the conversion (see circled area in my screenshot). Also, would you please convert a relatively small number of pages from your example (maybe 20) and post the converted result that has "no ocr"?
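
A short test run like that can be limited with a page range. The sketch below only prints a command line rather than running it; ocr_test_cmd is a hypothetical helper, the file name is a placeholder, and -p is the page-range option as I recall it from the k2pdfopt usage text, so check it against your version's built-in help.

```shell
# Hypothetical helper: print a k2pdfopt command line for a short OCR test.
# Arguments: source file, OCR language (default rus), page range (default 1-20).
# Note: the file name is printed unquoted, so avoid names with spaces here.
ocr_test_cmd() {
    src="$1"
    lang="${2:-rus}"
    pages="${3:-1-20}"
    printf 'k2pdfopt -ocr -ocrlang %s -p %s %s\n' "$lang" "$pages" "$src"
}
```

For example, ocr_test_cmd book.pdf prints a command that converts only pages 1 to 20 with Russian OCR, which keeps both the run and the posted output small.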
Attached thumbnail: screenshot.png (31.6 KB)
Old 02-06-2018, 10:54 AM   #1518
josorio
Junior Member
Posts: 3
Karma: 37928
Join Date: Feb 2018
Device: Kindle Paperwhite & Moto X 2nd gen
Break 2-column page into exactly 4 pages

Hi, is there a way to force k2pdfopt to break each of my 2-column pages into exactly four small pages?
I will read mainly on a Kindle Paperwhite, but sometimes I'll read the same output on my Moto X 2014 (1080x1920, 424 DPI). I've got a neat conversion, with no margins and using native PDF output, but now I want more text on each page, at the expense of smaller letters. Also, I want the footnotes to always be at the foot of a page.
Thanks for any clue.
Old 02-06-2018, 05:54 PM   #1519
josorio
Junior Member
Posts: 3
Karma: 37928
Join Date: Feb 2018
Device: Kindle Paperwhite & Moto X 2nd gen
Quote:
Originally Posted by josorio View Post
Hi, is there a way to force k2pdfopt to break each of my 2-column pages into exactly four small pages?
I will read mainly on a Kindle Paperwhite, but sometimes I'll read the same output on my Moto X 2014 (1080x1920, 424 DPI). I've got a neat conversion, with no margins and using native PDF output, but now I want more text on each page, at the expense of smaller letters. Also, I want the footnotes to always be at the foot of a page.
Thanks for any clue.
I found a way:
- Conversion mode 2-column. Set Native PDF output.
- Set one crop area around each column.
- Set the Device as Kindle Voyage.
- Set -bp m in additional options.
- Set Width to 758 px, the same as Kindle Paperwhite
- Set DPI to 300 (I guess this is not important)
- Tweak the Height while checking the bottom of the even pages. The text should fill the whole page without pushing any line onto the next page. In my case 1220 was a good value.
Any easier option?
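
For anyone who prefers the command line, the GUI settings above map roughly onto k2pdfopt options. This is a minimal sketch under assumptions: the option names (-mode 2col, -n, -w, -h, -dpi) are the ones I believe the usage text lists, the crop areas and the device choice are left out because their command-line form would only be a guess here, and paperwhite_cmd is a hypothetical helper that prints the command instead of running it.

```shell
# Hypothetical helper: print a command line approximating the GUI recipe.
# Arguments: source file, output height in pixels (default 1220).
# Crop areas and the device setting from the recipe are intentionally omitted.
paperwhite_cmd() {
    src="$1"
    height="${2:-1220}"
    # -mode goes first so the explicit width/height/dpi values that follow
    # are not overridden by the mode's defaults.
    printf 'k2pdfopt -mode 2col -n -bp m -w 758 -h %s -dpi 300 %s\n' "$height" "$src"
}
```

Passing a second argument, for example paperwhite_cmd myfile.pdf 1200, makes it easy to try different heights while checking the bottom of the even pages.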
Old 02-06-2018, 10:48 PM   #1520
willus
Fuzzball, the purple cat
Posts: 967
Karma: 7562459
Join Date: Jun 2011
Location: California
Device: iPad
Quote:
Originally Posted by josorio View Post
I found a way:
- Conversion mode 2-column. Set Native PDF output.
- Set one crop area around each column.
- Set the Device as Kindle Voyage.
- Set -bp m in additional options.
- Set Width to 758 px, the same as Kindle Paperwhite
- Set DPI to 300 (I guess this is not important)
- Tweak the Height looking at the bottom of the even pages. It should occupy the whole page without throwing any line to the next page. In my case 1220 was a good value.
Any easier option?
Maybe just use the -grid option:

k2pdfopt -grid 2x2x5 myfile.pdf

This will parse myfile.pdf into a 2 x 2 grid, with an output page for each square in the grid. The "5" in the argument above specifies 5% overlap for the grid squares.
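
To make the arithmetic concrete, here is a small sketch that builds the -grid argument and estimates the number of output pages. Both helpers are hypothetical, and the page count assumes every grid square produces exactly one output page, as described above.

```shell
# Hypothetical helpers for the -grid option.
# grid_arg: build the COLSxROWSxOVERLAP string passed after -grid.
grid_arg() {
    printf '%sx%sx%s\n' "$1" "$2" "$3"
}

# grid_page_count: output pages for a given number of source pages,
# assuming one output page per grid square.
grid_page_count() {
    echo $(( $1 * $2 * $3 ))
}
```

For a 2 x 2 grid, grid_arg 2 2 5 gives the string used in the command above, and a 10-page source would come out as 40 pages, which matches the "exactly four pages per source page" goal.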
Old 02-08-2018, 05:50 PM   #1521
josorio
Junior Member
Posts: 3
Karma: 37928
Join Date: Feb 2018
Device: Kindle Paperwhite & Moto X 2nd gen
Break 2-column page into exactly 4 pages

Excellent, Willus! -grid 2x2x0.2 worked nicely for me. I had to abandon my original crop areas and replace them with ignore areas. Thanks a lot.
Old 02-11-2018, 07:20 AM   #1522
axet
Junior Member
Posts: 1
Karma: 10
Join Date: Feb 2018
Device: android phone
A few source code issues:

willus/linux.h

Code:
-#include <sys/termios.h>
+#include <termios.h>
<sys/termios.h> is not found when compiling for Android. termios looks unused, so the include could also be dropped entirely.
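
Whether termios is really unused can be checked by searching the source tree for the API. A minimal sketch, assuming the usual termios identifiers are what matter; termios_users is a hypothetical helper, and empty output only suggests (it does not prove) that the include can go.

```shell
# Hypothetical helper: list lines in .c/.h files under a directory that
# reference the termios API (the #include line itself is not matched).
termios_users() {
    # /dev/null is passed as an extra file so grep always prints file names,
    # even when find hands it a single file.
    find "$1" -type f \( -name '*.c' -o -name '*.h' \) \
        -exec grep -nE 'struct termios|tc[gs]etattr|cf(get|set)[io]speed' /dev/null {} + || true
}
```

Running termios_users willus from the top of the source tree would print file:line matches, if there are any.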

k2pdfopt/k2master.c

Code:
#if HAVE_LEPTONICA_LIB
        wlept_bmp_dewarp(dwbmp,src,srcgrey,white,k2settings->dewarp,
                         k2settings->debug?"k2opt_dewarp_model.pdf":NULL);
#endif
The HAVE_LEPTONICA_LIB #if/#endif guard is missing, so this does not compile when Leptonica is not built in. I'm not sure this is the correct fix, though; it seems like the call should be replaced with an equivalent one.

Code:
-    if (k2settings->ocr_max_columns==2 || k2settings->max_columns>1)
+    if (k2settings->max_columns==2 || k2settings->max_columns>1)
The 'ocr_max_columns' member is missing; I replaced it with 'max_columns' (which makes the first test redundant, since max_columns>1 already covers it).