Old 06-28-2020, 11:19 PM   #1801
wiz011
Junior Member
Posts: 6
Karma: 42956
Join Date: Jun 2020
Device: none
Gaps depending on source resolution

I have two scanned PDFs (the same edition, design, fonts, etc.) created from pictures. One was created from 750 x 1182 source pictures, the other from 650 x 1025 source pictures. The PDF converted from the 750 x 1182 source has a lot of "gaps", while the conversion of the 650 x 1025 source is almost excellent. When I export the 750 x 1182 PDF to images (using PDF XChange Viewer), resize those pictures to 650 x 1025, create a new PDF from them, and then convert it, all is fine.

The target device is a Huawei P20 lite mobile phone, with a resolution of 1080x2110 at 150 dpi (this is the visible PDF area in the PocketBook reader application). Additional options: -om 0.2 -y -de 3.0 -gtc 0.2 (the problem also exists without these options).

I can post samples of the source and resulting PDFs, as well as a screenshot of the settings, on a file-sharing site.
Old 06-29-2020, 11:55 PM   #1802
willus
Fuzzball, the purple cat
Posts: 1,302
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
If you could post example pages from the two different source PDF files, that would help the most. It's hard to know what you mean by "a lot of gaps" vs. "almost excellent" without some visuals.
Old 06-30-2020, 10:23 AM   #1803
wiz011
Junior Member
I'll upload them later. "A lot of gaps" means something like this:

650 x 1025

XXXXXXXXXX
XXXXXXXXXX
XXXXXXXXXX
XXXXXXXXXX
XXXXXXXXXX

(the result is one page)


750x1182

XXXXXXXXXX XXXXXXXXXX
XXXXXXXXXX
XXXXXXXXXX
XXXXXXXXXX

(the result is two pages)

In most cases, the problem occurs where the source has the end of one page and the beginning of the next.

Also, in some places there is a problem like this:

XXXXXXXXXX
XXXXXXXXXX

XXXXXXXXXX
XXXXXXXXXX
XXXXXXXXXX

(P.S. The forum editor changes my "examples"; the page with one row is the first one.)

Last edited by wiz011; 06-30-2020 at 10:26 AM. Reason: Explanation
Old 06-30-2020, 09:04 PM   #1804
willus
Fuzzball, the purple cat
Please attach sample pages from the source PDF files. You can even use k2pdfopt to create the sample pages:

k2pdfopt -mode copy -n source.pdf -p 10-15 -o pages_10_to_15.pdf
Old 07-04-2020, 07:14 AM   #1805
wiz011
Junior Member
I apologize for the delay; I was rather busy over the past few days. Here are the samples and a screenshot of the settings.

http://www.mediafire.com/file/5rs7c7...mples.zip/file
Old 07-04-2020, 07:32 AM   #1806
wiz011
Junior Member
A rather weird thing is that the conversion of the 5-page sample included in samples.zip is excellent. So here is the whole original source, which caused the problems.

https://www.mediafire.com/file/pquji...ource.zip/file

Maybe I should try with PDFs that have no cover or other parts, only the two-column text?
Old 07-04-2020, 06:12 PM   #1807
willus
Fuzzball, the purple cat
The issue is that the 750x1182 version has bookmarks / a table of contents (toc) entry for every page (see attached). By default, k2pdfopt starts a new conversion page for each new page in the table of contents. To disable this, add -bp-- in the "additional options". I'm guessing your 650x1025 source file did not have bookmarks / toc.

This is why it is so important to post the source file. I could not have easily diagnosed this without seeing your source file.
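
As a rough illustration of the -bp-- fix described above, the sketch below assembles a k2pdfopt command line that carries over the options from the original report and appends -bp--. The helper name, the file names, and the choice to pass -o are illustrative assumptions, not something the thread prescribes, and the command is only printed, not run.

```python
import shlex

def build_k2pdfopt_command(source, output, extra_options=(), break_on_toc=False):
    """Return an argument list for k2pdfopt (hypothetical helper).

    When break_on_toc is False, "-bp--" is appended, which per the post
    above stops k2pdfopt from starting a new output page at each
    bookmark / table-of-contents entry.
    """
    args = ["k2pdfopt", source]
    args.extend(extra_options)
    if not break_on_toc and "-bp--" not in args:
        args.append("-bp--")
    args.extend(["-o", output])
    return args

# Options quoted from the original report; file names are placeholders.
reported_options = ["-om", "0.2", "-y", "-de", "3.0", "-gtc", "0.2"]
command = build_k2pdfopt_command("source_750x1182.pdf", "converted.pdf", reported_options)
print(shlex.join(command))
```

In the GUI, the same effect comes from typing -bp-- into the "additional options" field, as described in the post.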
Attached Thumbnails
screenshot.png (193.7 KB)
Old 07-08-2020, 07:02 AM   #1808
wiz011
Junior Member
LOL, it's funny how my assumption related to resolution was entirely wrong. Thanks for your explanation.

I have one more question related to -gtr settings, but, again, I must first prepare the examples.
Old 07-18-2020, 11:45 AM   #1809
willus
Fuzzball, the purple cat
k2pdfopt v2.53 released

k2pdfopt v2.53 is released.
This version improves OCR multithreading, adds better DjVu support (text layer extraction), adds CBZ support, and is compiled with the latest third-party libraries, e.g. Tesseract 4.1.1.
See details at the web site.
Old 08-07-2020, 05:53 AM   #1810
vasilas7
Junior Member
Posts: 3
Karma: 42956
Join Date: Mar 2019
Device: Kindle 3 Keyboard
Hello Willus. I have a problem with one PDF file. Could you please help me optimize this file for the Kindle Paperwhite 4? Thanks in advance!
Attached Files
File Type: pdf docdownloader.com-pdf-.pdf (7.56 MB, 581 views)
Old 08-07-2020, 09:32 AM   #1811
willus
Fuzzball, the purple cat
I found the best method was to first split into side-by-side pages and then to convert that result with autostraighten and OCR.

Code:
k2pdfopt book.pdf -mode crop -dpi 50 -fc- -cbox1 0.36in,0.75in,3.22in,4.32in -cbox1 5.35in,0.1in,5.07in,8.36in -cbox3 0.41in,3.79in,2.56in,3.08in -cbox4- 0.66in,0in,4.28in,8.73in -cbox4- 5.41in,0in,4.67in,8.73in -o booksplit.pdf

k2pdfopt booksplit.pdf  -p 2-,1 -as -mode fw -ls- -n- -w 3.39in -h 4.55in -dpi 165 -de 1.5 -ocr t -ocrlang ell -o final.pdf
The various -cbox options crop out certain parts of pages 1 and 3, and then all the remaining pages are split. The second command fits to the page width, straightens, and adds OCR. I've attached the result.
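
To make the crop-box arithmetic concrete, here is a small sketch that parses the two -cbox4- values from the first command. It assumes the four comma-separated numbers are left, top, width, and height in inches; that reading is an assumption here, though it is consistent with the two boxes sitting side by side on a spread roughly 10 inches wide. The CropBox type and parse_cbox helper are made up for illustration and are not part of k2pdfopt.

```python
from typing import NamedTuple

class CropBox(NamedTuple):
    left: float
    top: float
    width: float
    height: float

    @property
    def right(self) -> float:
        # Right edge of the box, measured from the left edge of the page.
        return self.left + self.width

def parse_cbox(spec: str) -> CropBox:
    """Parse 'L,T,W,H' with an 'in' suffix on each value (assumed order)."""
    values = []
    for part in spec.split(","):
        part = part.strip()
        if not part.endswith("in"):
            raise ValueError(f"expected a value in inches, got {part!r}")
        values.append(float(part[:-2]))
    if len(values) != 4:
        raise ValueError(f"expected 4 values, got {len(values)}")
    return CropBox(*values)

# The two boxes given to -cbox4- in the command above.
left_page = parse_cbox("0.66in,0in,4.28in,8.73in")
right_page = parse_cbox("5.41in,0in,4.67in,8.73in")

for name, box in (("left page", left_page), ("right page", right_page)):
    print(f"{name}: x from {box.left:.2f}in to {box.right:.2f}in, "
          f"height {box.height:.2f}in")
```

Under that assumption the left box ends before the right one begins, so each scanned spread from page 4 onward yields two separate output regions, which matches the "split" described in the post.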
Attached Files
File Type: pdf final.pdf (14.35 MB, 448 views)
Old 08-29-2020, 11:53 AM   #1812
GoldenWorm
Junior Member
Posts: 2
Karma: 10
Join Date: Aug 2020
Device: Pocketbook touch lux 3
Question

Nice tool for PDF docs, thanks! Anyway, I have a question: is it possible to remove empty lines without text from a PDF and make it look like one block of text on the e-reader screen? For example:
I saw a bird.

Bird was flying.

I mean removing the empty line between those two lines.
Thanks.
Old 08-30-2020, 07:54 AM   #1813
willus
Fuzzball, the purple cat
Yes. Please see the command-line usage page, the -vs and -vb options. E.g. the attached example was processed using:
k2pdfopt -vs 0.1 question.pdf
Attached Files
File Type: pdf question.pdf (45.4 KB, 282 views)
File Type: pdf question_k2opt.pdf (64.8 KB, 287 views)

Last edited by willus; 08-30-2020 at 07:57 AM.
Old 09-02-2020, 10:35 PM   #1814
MarjaE
Guru
Posts: 937
Karma: 53902736
Join Date: Jun 2015
Device: multiple
Hi,

Is there a way to pipe the output from another script into k2pdfopt? Or pipe the output into another script?
Old 09-03-2020, 10:11 PM   #1815
willus
Fuzzball, the purple cat
Not presently. You'll have to go through files rather than through stdin/stdout.
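
One way to work around the lack of stdin/stdout support is a small wrapper that bridges a pipe through temporary files. The following is only a sketch under assumptions: the function name and the "{src}"/"{dst}" placeholders are invented for this example, and the k2pdfopt invocation shown in main() uses just an input file and -o as seen earlier in the thread. k2pdfopt may also wait for interactive input, so check the command-line usage page for options that suppress prompts before relying on this.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Sequence

def run_via_temp_files(data: bytes, command_template: Sequence[str]) -> bytes:
    """Feed `data` to a file-based tool and return the bytes it wrote.

    Each "{src}" / "{dst}" in command_template is replaced by the path of
    a temporary input / output file (placeholders invented for this sketch).
    """
    with tempfile.TemporaryDirectory() as workdir:
        src = Path(workdir) / "input.pdf"
        dst = Path(workdir) / "output.pdf"
        src.write_bytes(data)
        command = [
            arg.replace("{src}", str(src)).replace("{dst}", str(dst))
            for arg in command_template
        ]
        # The tool's console output is discarded so it cannot mix into
        # the data that the wrapper later writes to its own stdout.
        subprocess.run(
            command,
            check=True,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
        )
        return dst.read_bytes()

def main() -> None:
    # Assumed invocation: input file plus -o for the output file.
    template = ["k2pdfopt", "{src}", "-o", "{dst}"]
    result = run_via_temp_files(sys.stdin.buffer.read(), template)
    sys.stdout.buffer.write(result)
```

Calling main() at the end of a script would let it sit in the middle of a pipeline, with the upstream script writing a PDF to the wrapper's stdin and the downstream script reading the converted PDF from its stdout.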