01-25-2018, 09:04 AM | #1516 |
Guru
Posts: 924
Karma: 53902736
Join Date: Jun 2015
Device: multiple
|
Screen shots of what?
I don't know what parts of the conversion attempts you want shots of. If I knew what you thought they might show, I might be able to figure that out.
If I try to run OCR from uncustomized k2pdfopt, it can't find the environment variable TESSDATA_PREFIX and can't use Tesseract. If I run from my customized k2pdfopt_copy or k2pdfopt_dx, it can. k2pdfopt_copy uses -mode copy to avoid unnecessary compression. k2pdfopt_dx uses -mode copy -dev dx.
I have tried -ocr -ocrlang rus in k2pdfopt_copy. I may get a message stating:
Initializing OCR for 2 threads .. Tesseract Open Source OCR Engine v3.05.00 [CUBE+] (lang=rus) Reading 443 pages from ...
but get no OCR at the end.
I am currently comparing my results with K2 with results with Elucidate, but in the long run, I can't combine K2 with Elucidate.
Last edited by MarjaE; 01-25-2018 at 09:15 AM. |
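For anyone hitting the same TESSDATA_PREFIX problem, here is a rough way to check what the variable actually points at before starting a conversion. This is only a sketch: the function name find_traineddata is made up for illustration, and it assumes the variable may point either at the tessdata folder itself or at its parent, since that has differed between Tesseract versions. It says nothing about how k2pdfopt itself looks the data up.

```python
import os
from pathlib import Path


def find_traineddata(lang, env=None):
    """Return the path of <lang>.traineddata reachable via TESSDATA_PREFIX, or None."""
    env = os.environ if env is None else env
    prefix = env.get("TESSDATA_PREFIX")
    if not prefix:
        # Variable is unset or empty, which matches the "can't find" symptom.
        return None
    base = Path(prefix)
    # Assumption: try both layouts (prefix is the tessdata folder, or its parent).
    candidates = (
        base / f"{lang}.traineddata",
        base / "tessdata" / f"{lang}.traineddata",
    )
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    found = find_traineddata("rus")
    print(found if found else "rus.traineddata not found via TESSDATA_PREFIX")
```

If this prints a path, the language file is at least visible from the environment the script runs in; if not, the variable is unset or points somewhere without rus.traineddata.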
01-27-2018, 11:47 AM | #1517 |
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Something like what I've attached. In one graphic, I can tell what options you are running, what version you are using, if OCR started correctly, what size source pages you are converting, how many OCR words were found, and how much CPU was used. It is very useful.
Please tell me what it says at the end of the conversion (see circled area in my screenshot). Also, would you please convert a relatively small number of pages from your example (maybe 20) and post the converted result that has "no ocr"? |
02-06-2018, 09:54 AM | #1518 |
Junior Member
Posts: 3
Karma: 37928
Join Date: Feb 2018
Device: Kindle Paperwhite & Moto X 2nd gen
|
Break 2-column page in exactly 4 pages
Hi, is there a way to force k2pdfopt to break each of my 2-column pages into exactly four small pages?
I will read mainly on a Kindle Paperwhite, but sometimes I'll read the same output on my Moto X 2014 (1080x1920, 424 DPI). I've got a neat conversion, with no margins and using native PDF output, but now I want more text on each page, at the expense of smaller letters. Also, I want the footnotes to always be at the foot of a page. Thanks for any clue. |
02-06-2018, 04:54 PM | #1519 | |
Junior Member
Posts: 3
Karma: 37928
Join Date: Feb 2018
Device: Kindle Paperwhite & Moto X 2nd gen
|
Quote:
- Conversion mode 2-column. Set Native PDF output.
- Set one crop area around each column.
- Set the Device as Kindle Voyage.
- Set -bp m in additional options.
- Set Width to 758 px, the same as Kindle Paperwhite.
- Set DPI to 300 (I guess this is not important).
- Tweak the Height looking at the bottom of the even pages. It should occupy the whole page without throwing any line to the next page. In my case 1220 was a good value.
Any easier option? |
|
02-06-2018, 09:48 PM | #1520 | |
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Quote:
k2pdfopt -grid 2x2x5 myfile.pdf
This will parse myfile.pdf into a 2 x 2 grid, with an output page for each square in the grid. The "5" in the argument above specifies 5% overlap for the grid squares. |
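To picture what a 2 x 2 grid with 5% overlap means, here is a small sketch that computes the four crop boxes for one page. It is not k2pdfopt's code. The function name grid_cells is hypothetical, and two things are assumptions: that the overlap is a percentage of one cell's size added on each side and clipped at the page edges, and that cells are taken column by column (which suits a 2-column page).

```python
def grid_cells(page_w, page_h, cols, rows, overlap_pct):
    """Return (left, top, right, bottom) boxes, column by column, top to bottom."""
    cell_w = page_w / cols
    cell_h = page_h / rows
    # Assumption: overlap is a percentage of one cell's width/height.
    pad_x = cell_w * overlap_pct / 100.0
    pad_y = cell_h * overlap_pct / 100.0
    boxes = []
    for c in range(cols):
        for r in range(rows):
            # Grow each cell by the padding, but never past the page edges.
            left = max(0.0, c * cell_w - pad_x)
            top = max(0.0, r * cell_h - pad_y)
            right = min(float(page_w), (c + 1) * cell_w + pad_x)
            bottom = min(float(page_h), (r + 1) * cell_h + pad_y)
            boxes.append((left, top, right, bottom))
    return boxes


if __name__ == "__main__":
    for box in grid_cells(1000, 2000, cols=2, rows=2, overlap_pct=5):
        print(box)
```

With an overlap of 0 the boxes tile the page exactly; a small positive value repeats a strip of content at each interior edge, so a line cut by the grid boundary shows up whole on at least one output page.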
|
02-08-2018, 04:50 PM | #1521 |
Junior Member
Posts: 3
Karma: 37928
Join Date: Feb 2018
Device: Kindle Paperwhite & Moto X 2nd gen
|
Break 2-column page in exactly 4 pages
Excellent, Willus! -grid 2x2x0.2 was nice for me. I had to abandon my original crop areas and replace them with ignore areas. Thanks a lot.
|
02-11-2018, 06:20 AM | #1522 |
Junior Member
Posts: 6
Karma: 42208
Join Date: Feb 2018
Device: android phone
|
A few source code issues:
willus/linux.h
Code:
-#include <sys/termios.h>
+#include <termios.h>
k2pdfopt/k2master.c
Code:
#if HAVE_LEPTONICA_LIB
wlept_bmp_dewarp(dwbmp,src,srcgrey,white,k2settings->dewarp, k2settings->debug?"k2opt_dewarp_model.pdf":NULL);
#endif
Code:
- if (k2settings->ocr_max_columns==2 || k2settings->max_columns>1)
+ if (k2settings->max_columns==2 || k2settings->max_columns>1)
|
03-01-2018, 08:55 PM | #1523 |
Junior Member
Posts: 1
Karma: 10
Join Date: Feb 2018
Device: none
|
K2pdfopt is magical software, very good.
I found a problem with http://willus.com/k2pdfopt/examples/size/bugs.pdf when using these parameters: -m 0 -bpc 8 -c -p 2
With them, the brightness of color images is reduced and the colors are shifted toward red. The details of the images are not clear, especially in the darker parts. How can I solve this? |
03-04-2018, 07:53 PM | #1524 | |
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Quote:
-g 1 -cmax 1 |
|
03-04-2018, 10:16 PM | #1525 |
Junior Member
Posts: 3
Karma: 10
Join Date: Oct 2017
Device: kobo aura 2011 with koreader
|
Hello, sir or madam. You have made a great tool.
Is it possible to convert PDF/DjVu and output DjVu or images? |
03-05-2018, 08:25 AM | #1526 |
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
|
03-12-2018, 08:19 PM | #1527 |
Guru
Posts: 924
Karma: 53902736
Join Date: Jun 2015
Device: multiple
|
I've been experimenting with different OCR tools: the built-in OCR in k2pdfopt, Elucidate, and ocrmypdf.
All of these use Tesseract. But the k2pdfopt version often misses text which the other versions convert. Unfortunately, OCRing in either Elucidate or ocrmypdf, and then converting in either k2pdfopt or Ghostscript, often leads to an unreadable mess. Is there any way to OCR and convert in k2pdfopt, while getting the OCR quality of the other tools that use Tesseract? After setting up the tessdata folder, is it just a matter of downloading from tessdata-best, instead of just tessdata? |
03-12-2018, 10:10 PM | #1528 | |
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Quote:
Edit: Please run Elucidate or ocrmypdf on the attached document and post the resulting PDF. Last edited by willus; 03-17-2018 at 12:26 PM. Reason: Added corrected attachment |
|
03-13-2018, 10:03 AM | #1529 |
Guru
Posts: 924
Karma: 53902736
Join Date: Jun 2015
Device: multiple
|
Sorry, got mixed up. Had decent results with E+K2 and O+K2, mixed and sometimes terrible results with E+GS and O+GS.
Examples include:
E+K2: -mode copy -dev dx
E+GS: -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -sstdout=%stderr -dPDFSETTINGS=/screen -dNOPAUSE -dQUIET -dBATCH -sOutputFile="$outputfile" "$f"
where "$f" is the input filename. |
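For reference, the Ghostscript options above can be assembled as an argument list like this. It is just a sketch of the same command, not a recommended setting: the function name gs_screen_args is hypothetical, the executable name "gs" is an assumption (it differs on Windows), and the stdout redirect is written as %stderr, which is the spelling Ghostscript documents.

```python
import subprocess
import sys


def gs_screen_args(input_pdf, output_pdf):
    """Build the Ghostscript argument list for a /screen-quality pdfwrite pass."""
    return [
        "gs",  # assumption: Ghostscript is on PATH under this name
        "-sDEVICE=pdfwrite",
        "-dCompatibilityLevel=1.4",
        "-sstdout=%stderr",  # send Ghostscript's stdout messages to stderr
        "-dPDFSETTINGS=/screen",
        "-dNOPAUSE",
        "-dQUIET",
        "-dBATCH",
        f"-sOutputFile={output_pdf}",
        input_pdf,
    ]


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python this_script.py input.pdf output.pdf
    subprocess.run(gs_screen_args(sys.argv[1], sys.argv[2]), check=True)
```

Passing the arguments as a list avoids shell quoting problems with filenames that contain spaces.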
03-13-2018, 08:33 PM | #1530 | |
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Quote:
|
|