#1801
Junior Member
Posts: 6
Karma: 42956
Join Date: Jun 2020
Device: none
Gaps depending on source resolution
I have two scanned PDFs of the same edition (same design, fonts, etc.), both created from page images. One was built from 750 x 1182 source images, the other from 650 x 1025 source images. The PDF converted from the 750 x 1182 source has a lot of "gaps", while the conversion of the 650 x 1025 source is almost excellent. If I export the 750 x 1182 PDF to images (using PDF-XChange Viewer), resize those images to 650 x 1025, build a new PDF from them, and convert that, everything is fine.
The target device is a Huawei P20 lite mobile phone, set up as 1080 x 2110 at 150 dpi (that is the visible PDF area in the PocketBook reader application), with additional options -om 0.2 -y -de 3.0 -gtc 0.2 (the problem also occurs without these options). I can post samples of the source and resulting PDFs, as well as a screenshot of the settings, on a file-sharing site.
#1802
Fuzzball, the purple cat
Posts: 1,302
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
If you could post example pages from the two different source PDF files, that would help the most. It's hard to know what you mean by "a lot of gaps" vs. "almost excellent" without some visuals.
#1803
Junior Member
Posts: 6
Karma: 42956
Join Date: Jun 2020
Device: none
I'll upload them later. "A lot of gaps" means something like the following:
650 x 1025 (the result is one page):
XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX
750 x 1182 (the result is two pages):
XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX
In most cases, the problem occurs where the source has the end of one page and the beginning of the next. In some places there is also a problem like this:
XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX
(P.S. The forum editor changes my "examples"; the page with one row is the first one.)
Last edited by wiz011; 06-30-2020 at 10:26 AM. Reason: Explanation
#1804
Fuzzball, the purple cat
Posts: 1,302
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
Please attach sample pages from the source PDF files. You can even use k2pdfopt to create the sample pages:
k2pdfopt -mode copy -n source.pdf -p 10-15 -o pages_10_to_15.pdf
#1805
Junior Member
Posts: 6
Karma: 42956
Join Date: Jun 2020
Device: none
I apologize for the delay; I was rather busy during the past few days. Here are the samples and the screenshot of the settings.
http://www.mediafire.com/file/5rs7c7...mples.zip/file
#1806
Junior Member
Posts: 6
Karma: 42956
Join Date: Jun 2020
Device: none
A rather weird thing is that the conversion of the 5-page sample included in samples.zip is excellent. So here is the whole original source, which caused the problems.
https://www.mediafire.com/file/pquji...ource.zip/file
Maybe I should try with PDFs that contain only the two-column text, without the cover and other parts?
#1807
Fuzzball, the purple cat
Posts: 1,302
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
The issue is that the 750x1182 version has bookmarks / a table of contents (toc) entry for every page (see attached). By default, k2pdfopt starts a new conversion page for each new page in the table of contents. To disable this, add -bp-- in the "additional options". I'm guessing your 650x1025 source file did not have bookmarks / toc.
This is why it is so important to post the source file. I could not have easily diagnosed this without seeing your source file.
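For anyone running k2pdfopt from a command line rather than through the "additional options" box, the fix above might look roughly like the sketch below. It only combines flags already quoted in this thread (-bp--, -om, -y, -de, -gtc, -o); the function name and the file names are made up for illustration, and the device width, height, and dpi are assumed to be configured separately.

```shell
# Sketch only: hypothetical helper and file names; flags are the ones quoted in this thread.
K2PDFOPT="${K2PDFOPT:-k2pdfopt}"   # set K2PDFOPT=echo to print the arguments instead of converting

convert_ignoring_toc_breaks() {
    # $1 = source PDF (with per-page bookmarks), $2 = output PDF
    # -bp-- disables the page break that is otherwise started at each bookmark / toc entry.
    "$K2PDFOPT" "$1" -bp-- -om 0.2 -y -de 3.0 -gtc 0.2 -o "$2"
}
```

Usage would be something like `convert_ignoring_toc_breaks source_750x1182.pdf converted.pdf`. With `K2PDFOPT=echo` the function just prints the arguments it would have passed, which is a cheap way to check the option string before a long conversion.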
#1808
Junior Member
Posts: 6
Karma: 42956
Join Date: Jun 2020
Device: none
LOL, it's funny how my assumption related to resolution was entirely wrong. Thanks for your explanation.
I have one more question related to -gtr settings, but, again, I must first prepare the examples.
#1809
Fuzzball, the purple cat
Posts: 1,302
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
k2pdfopt v2.53 released
k2pdfopt v2.53 is released.
This version improves OCR multithreading, adds better DjVu support (text layer extraction), adds CBZ support, and is compiled with the latest third-party libraries, e.g. Tesseract 4.1.1. See details at the web site.
#1810
Junior Member
Posts: 3
Karma: 42956
Join Date: Mar 2019
Device: Kindle 3 Keyboard
Hello Willus. I have a problem with one PDF file. Could you please help me optimize this file for the Kindle Paperwhite 4? Thanks in advance!
#1811
Fuzzball, the purple cat
Posts: 1,302
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
I found the best method was to first split into side-by-side pages and then to convert that result with autostraighten and OCR.
Code:
k2pdfopt book.pdf -mode crop -dpi 50 -fc- -cbox1 0.36in,0.75in,3.22in,4.32in -cbox1 5.35in,0.1in,5.07in,8.36in -cbox3 0.41in,3.79in,2.56in,3.08in -cbox4- 0.66in,0in,4.28in,8.73in -cbox4- 5.41in,0in,4.67in,8.73in -o booksplit.pdf
k2pdfopt booksplit.pdf -p 2-,1 -as -mode fw -ls- -n- -w 3.39in -h 4.55in -dpi 165 -de 1.5 -ocr t -ocrlang ell -o final.pdf
#1812
Junior Member
Posts: 2
Karma: 10
Join Date: Aug 2020
Device: Pocketbook touch lux 3
Question
Nice tool for PDF docs, thanks! I have a question: is it possible to remove empty lines (lines without text) from a PDF so that the text looks like one block on the e-reader screen? For example:
I saw a bird.

Bird was flying.
I mean removing the empty line between those two lines. Thanks.
#1813
Fuzzball, the purple cat
Posts: 1,302
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
k2pdfopt -vs 0.1 question.pdf
Last edited by willus; 08-30-2020 at 07:57 AM.
#1814
Guru
Posts: 937
Karma: 53902736
Join Date: Jun 2015
Device: multiple
Hi,
Is there a way to pipe the output from another script into k2pdfopt? Or pipe the output into another script?
#1815
Fuzzball, the purple cat
Posts: 1,302
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
Not presently. You'll have to go through files rather than through stdin/stdout.
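Going through files can still be hidden behind a small wrapper so that it behaves like a filter in a pipeline. The following is a minimal sketch of that idea, not a k2pdfopt feature: the function name `k2pdfopt_filter` is hypothetical, it assumes `mktemp -d` is available, and it assumes k2pdfopt runs unattended with whatever options you pass (you may need to add the options that suppress interactive prompts in your setup). k2pdfopt's own console output is sent to stderr so that only the converted file reaches stdout.

```shell
# Sketch only: bridges stdin/stdout through temporary files, since k2pdfopt
# itself reads and writes files. The function name is hypothetical.
K2PDFOPT="${K2PDFOPT:-k2pdfopt}"   # can be pointed at another command for testing

k2pdfopt_filter() {
    # Extra arguments ("$@") are passed through to k2pdfopt unchanged.
    tmpdir=$(mktemp -d) || return 1
    status=0
    # 1. Save the upstream script's output (stdin) to a temporary file.
    cat > "$tmpdir/in.pdf"
    # 2. Convert file to file; stdin is detached and console output goes to stderr.
    # 3. On success, stream the converted file to stdout for the next script.
    if "$K2PDFOPT" "$tmpdir/in.pdf" "$@" -o "$tmpdir/out.pdf" </dev/null >&2; then
        cat "$tmpdir/out.pdf" || status=1
    else
        status=1
    fi
    rm -rf "$tmpdir"
    return "$status"
}
```

With hypothetical upstream and downstream scripts, usage would look like `make_pdf.sh | k2pdfopt_filter -mode fw | upload.sh`. The temporary directory is removed whether or not the conversion succeeds, and a failed conversion produces no output and a non-zero status.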