Old 05-05-2020, 10:25 PM   #1786
willus
Fuzzball, the purple cat
Posts: 1,138
Karma: 8561592
Join Date: Jun 2011
Location: California
Device: iPad
Quote:
Originally Posted by Shohreh
Thanks for the offer: https://we.tl/t-KA9InMaCCd

It looks much better. I just need to leave a bit more on the right, before merging the pages back into a single PDF.

k2pdfopt input.pdf -m 0.8 -mode fw -ls- -o output.pdf -p 30-40,70-75

--
Edit: While we're at it: Can k2pdfopt crop such and such page, and then turn them into pictures in place, without having to do this myself?

ie.
1. Use k2 to crop margins and rescale text
2. Export those pages
3. Convert to PNG
4. Convert to PDF
5. Merge all pages into hybrid PDF (text + pics)

Or maybe k2 can massage the PDF in such a way that there's simply no need to turn problematic pages into pictures (e.g. tables, insets, etc.)?

I'm not really sure what you're asking above. I think the point of using k2pdfopt is to avoid exactly those steps. k2pdfopt internally converts every page to a bitmap for a graphical analysis. It can then either save that bitmap to the output file or use it to figure out how to insert cropping instructions into the native PDF file (which is what -mode fw does).
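
For readers wondering what "graphical analysis" of a page bitmap can look like, here is a minimal Python sketch of one piece of it: finding the box that encloses everything darker than near-white, which is the kind of information needed to write a crop into a native PDF. This is only an illustration and not k2pdfopt's actual algorithm; the function name `content_bbox`, the list-of-rows bitmap layout, and the threshold of 250 are all assumptions made for the example.

```python
def content_bbox(bitmap, white_threshold=250):
    """Return (left, top, right, bottom) enclosing all non-white pixels.

    `bitmap` is a list of rows of 8-bit gray values (0 = black, 255 = white).
    Right and bottom are exclusive. Returns None for a blank page.
    """
    # Rows that contain at least one pixel darker than the threshold.
    rows = [y for y, row in enumerate(bitmap)
            if any(p < white_threshold for p in row)]
    if not rows:
        return None
    # Columns that contain at least one pixel darker than the threshold.
    width = len(bitmap[0])
    cols = [x for x in range(width)
            if any(row[x] < white_threshold for row in bitmap)]
    return (cols[0], rows[0], cols[-1] + 1, rows[-1] + 1)


# Hypothetical 6x5 page with two dark pixels.
page = [[255] * 6 for _ in range(5)]
page[1][2] = 0
page[3][4] = 10
print(content_bbox(page))
```

In bitmapped output the trimmed bitmap itself would be saved; in native mode only the box is needed, since the original page content stays as it is and is merely cropped.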

Also, I don't think you need to use the -m option since your PDF is very clean (not scanned). k2pdfopt will automatically trim away the margins in fit-width mode (-mode fw). What is wrong with the output from this:

k2pdfopt -mode fw -ls- twilight.pdf

Which converted pages don't you like?
Old 05-06-2020, 06:44 PM   #1787
Shohreh
Connoisseur
Posts: 72
Karma: 85308
Join Date: Jan 2016
Device: none
Sorry: It's because I started this journey with Calibre to turn PDF into EPUB. After seeing that my e-reader could actually read PDF rather well except complicated layouts such as tables, I figured I could just turn those pages into bitmaps and generate a hybrid PDF (text/bitmaps).

And then it dawned on me that k2pdfopt made all this useless since it can massage the whole file.

As for the particular PDF I sent: It looks perfect on the computer (SumatraPDF), but on my 6 inch e-reader (768x1024, 212 DPI), some pages are a bit messy although, strangely enough, those pages contain only text, in one column.

I used the command above:
Code:
k2pdfopt -mode fw -ls- twilight.pdf
Why does it occur, and what could I try?

https://postimg.cc/gallery/YLbzKXP
Old 05-06-2020, 11:27 PM   #1788
willus
Try turning off native mode so that you get bitmapped output. Your e-reader shouldn't be able to mess that up:

k2pdfopt -mode fw -ls- -n- twilight.pdf
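
The two runs compared in this exchange differ only in the -n- switch. A small Python sketch that builds both command lines, using only the options quoted in this thread (the helper name `k2pdfopt_args` is hypothetical, and nothing is executed here):

```python
def k2pdfopt_args(src, native=True):
    """Build the argument list for the fit-width runs discussed in the thread.

    native=True  -> k2pdfopt -mode fw -ls- <src>       (native PDF output)
    native=False -> k2pdfopt -mode fw -ls- -n- <src>   (bitmapped output)
    """
    args = ["k2pdfopt", "-mode", "fw", "-ls-"]
    if not native:
        # -n- turns native mode off, as suggested in the post above.
        args.append("-n-")
    args.append(src)
    return args


print(" ".join(k2pdfopt_args("twilight.pdf")))
print(" ".join(k2pdfopt_args("twilight.pdf", native=False)))
```

Passing either list to `subprocess.run` would start the conversion, assuming k2pdfopt is installed and on the PATH.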
Old 05-07-2020, 07:39 AM   #1789
Shohreh
Thanks for the tip.

As expected, the filesize is much bigger (6x).

FWIW, my e-reader offers a few settings when reading PDFs, one of them being a "Reflow text" option: The issue I had (some pages, with basic text, being a bit messy) occurs when I enable this option.

OTOH, with the "bitmapped file" (-mode fw -ls- -n-):
"Reflow text" enabled: The layout has even more issues than the native PDF (without -n-)
"Reflow text" disabled: Better, although I'll need to add a bit more margin because the text is too close to the edge.

I'm surprised bitmaps can be reflowed at all, since they're just pictures, unlike native PDFs.

Out of curiosity, why does this layout issue occur on some pages of the native PDF that contain only text? I would have expected it with more complicated layouts (tables, insets, etc.).
Old 05-07-2020, 07:58 AM   #1790
willus
You'd have to ask tech support for your e-reader why those issues come up, but as you said, they are tied to enabling the reflow text option. Sumatra shows the file okay, so it's an issue with the PDF display software in your reader. I'd be surprised if your reader is trying to modify a bitmapped page, but it's possible, especially if it uses some fork of KOReader or Duokan. KOReader reflow actually uses code from k2pdfopt (ha ha).

You can increase the output margins with the -om command-line option, e.g. -om 0.2 will add 0.2 inches of padding around the output pages.
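
To put that number in context for the 6 inch, 768x1024, 212 DPI screen mentioned earlier in the thread: inches convert to pixels by multiplying by the DPI. A rough Python sketch, assuming the output is rendered at the device resolution and that the padding applies equally to all four sides (the helper names are made up for the example):

```python
def margin_pixels(margin_in, dpi):
    """Convert a margin given in inches to pixels at the given DPI."""
    return margin_in * dpi


def usable_size(width_px, height_px, margin_in, dpi):
    """Pixels left for content after padding all four sides equally."""
    pad = margin_pixels(margin_in, dpi)
    return width_px - 2 * pad, height_px - 2 * pad


pad = margin_pixels(0.2, 212)
width, height = usable_size(768, 1024, 0.2, 212)
print(f"padding per side: about {pad:.0f} px")
print(f"content area: about {width:.0f} x {height:.0f} px")
```

So 0.2 inches comes to roughly 42 pixels per side on that screen, leaving a content area of about 683 by 939 pixels under these assumptions.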

Last edited by willus; 05-07-2020 at 08:03 AM.
Old 05-07-2020, 08:46 AM   #1791
Shohreh
Thanks much!

--
Edit: I went through the list of commands, but didn't find how to do this.

Since only some pages are wrongly displayed on my e-reader, is it possible to have k2 only turn those into bitmaps, while leaving the rest as text (native PDF)?

And out of curiosity, would you have an idea why my e-reader goes crazy with those, although they only contain basic text (so shouldn't be a problem)?

Last edited by Shohreh; 05-07-2020 at 03:32 PM.
Old 05-08-2020, 01:45 PM   #1792
willus
The best option is to use native mode for the whole thing and try to tell your reader not to do any reflow or anything else to the file. There is no way to do native mode on some pages and bitmapped output on others--you would have to do it both ways, creating two separate output files, and then merge the pages from each output file that you want into one final file using a program like cpdf. As I said in a previous post, I can't explain why your particular e-reader is displaying the PDF strangely if a standard reader like Sumatra is not. It would have to do with the particular PDF display software installed on your reader.
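
For anyone who does want to try the two-file merge, here is a hedged Python sketch of the bookkeeping. It assumes page N shows the same content in both output files, which is worth verifying since the two runs could paginate differently. The file names and helper names are hypothetical, and the command it prints reflects my reading of cpdf's merge syntax (`cpdf -merge file range file range ... -o out.pdf`), so check it against the cpdf manual before relying on it.

```python
def merge_runs(total_pages, bitmap_pages):
    """Group pages 1..total_pages into consecutive runs by source file.

    Returns a list of (source, first_page, last_page) tuples, where source
    is "bitmap" for pages listed in bitmap_pages and "native" otherwise.
    """
    runs = []
    for page in range(1, total_pages + 1):
        source = "bitmap" if page in bitmap_pages else "native"
        if runs and runs[-1][0] == source:
            # Extend the current run by one page.
            runs[-1] = (source, runs[-1][1], page)
        else:
            runs.append((source, page, page))
    return runs


def cpdf_merge_args(runs, native_pdf, bitmap_pdf, out_pdf):
    """Build a cpdf argument list (assumed syntax, see the note above)."""
    files = {"native": native_pdf, "bitmap": bitmap_pdf}
    args = ["cpdf", "-merge"]
    for source, first, last in runs:
        page_range = str(first) if first == last else f"{first}-{last}"
        args += [files[source], page_range]
    return args + ["-o", out_pdf]


# Hypothetical case: a 6-page book where pages 3 and 4 display badly.
runs = merge_runs(6, {3, 4})
print(" ".join(cpdf_merge_args(runs, "native.pdf", "bitmap.pdf", "hybrid.pdf")))
```

The point of grouping pages into runs is only to keep the command short; listing every page individually would describe the same merge.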
Old 05-08-2020, 04:09 PM   #1793
Shohreh
Yes, disabling "Reflow text" and adding a 0.2 inch margin looks pretty good. Thank you.
Old 06-12-2020, 03:15 PM   #1794
willus
k2pdfopt v2.52 released

k2pdfopt v2.52 has been released.
This is primarily a bug-fix release, fixing over 20 issues that have accumulated over time. There are also a few enhancements including the ability to directly download Tesseract OCR language data files (finally).
See details at the web site.
Old 06-13-2020, 05:18 PM   #1795
msh2050
Member
Posts: 20
Karma: 122330
Join Date: Sep 2017
Device: ipad , Kindle PW3
Great work.
I tested the OCR and it is working perfectly.
The program downloads the Arabic language data from the web.
The OCR text output file is OK,
but the text inside the PDF is not in Arabic (maybe it is an encoding issue).

regards

Last edited by msh2050; 06-13-2020 at 05:23 PM.
Old 06-13-2020, 07:27 PM   #1796
willus
I'm not sure what you mean by "the text inside the PDF is not in Arabic." The font used for the OCR layer is not an Arabic font--that is true, but if you copy and paste the text from the PDF file to another application like MS Word or Google, you should see the correct Arabic characters. And searching the PDF should also work correctly.
Old 06-17-2020, 08:20 AM   #1797
MarjaE
Fanatic
Posts: 586
Karma: 2200000
Join Date: Jun 2015
Device: Iriver Story HD, Amazon Kindle 5, and Amazon Kindle DX
Hi,

Aside from -cmax and -wt+, are there other tricks to increase contrast between text and shaded partial backgrounds? And is it possible to output black and white instead of grayscale or color? (I'm currently using a low value for -wt+ and it looks like black and white, but I'm not sure if it's just darker shades of gray and white.)

Last edited by MarjaE; 06-17-2020 at 08:26 AM.
Old 06-17-2020, 09:37 PM   #1798
willus
You can see if adjusting the gamma value helps (-g option). You can do black and white by setting bits per color to 2: -bpc 2.
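
As a side note on the arithmetic behind bits per color: n bits can encode 2^n distinct levels. The Python sketch below is generic quantization math, not k2pdfopt's own code, and it assumes that -bpc means bits per color plane as described above; the helper names are made up for the example.

```python
def gray_levels(bits_per_color):
    """Number of distinct levels that the given bit depth can encode."""
    return 2 ** bits_per_color


def quantize(value, bits_per_color):
    """Snap an 8-bit gray value (0-255) to the nearest available level."""
    levels = gray_levels(bits_per_color)
    step = 255 / (levels - 1)  # spacing between adjacent levels
    return round(round(value / step) * step)


for bits in (1, 2, 4):
    print(bits, "bit(s):", gray_levels(bits), "levels,",
          "gray 100 becomes", quantize(100, bits))
```

On that reading, one bit gives exactly two levels (pure black and pure white), while two bits still allow two intermediate grays, so it may be worth comparing -bpc 1 and -bpc 2 on a sample page and checking the k2pdfopt documentation.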
Old 06-23-2020, 03:21 PM   #1799
Gaaks
Junior Member
Posts: 1
Karma: 10
Join Date: Jun 2020
Device: KT4
Hello Willus, first I'd like to thank you for making this software. It's simply amazing: pretty much all of my material is in PDF format, and k2pdfopt has made some things that were unreadable on a Kindle amazing to read.

But I am having problems with photocopied/scanned material, the kind where there are two pages side by side. I tried using the 2-column feature to split the pages, but I guess that because the scans are low quality, the detection isn't working well: some pages get split in the wrong places and others don't get split at all.

What I've been doing is using the crop areas feature and selecting each page. That works well: the output is in the correct sequence, and each crop is treated as its own page instead of the entire image being processed. The problem is simply that I can't set a crop for each file, so because of their lack of centering and different sizes, the crop never lines up properly and I have to do them one by one.

I've also been running into the issue of text getting split in badly centered PDFs. I've been putting -f2p -1 into the additional options like you recommend in the FAQ, but I couldn't find the "bp" options in the GUI.

Another, less common problem is that some high-quality color scans keep the beige color of the page, and the output ends up grayscale with a gray background, which is annoying to read. I wanted to know if there's a way to make the output monochrome. -bpc 1 sometimes works, but the text quality is really poor, and sometimes the output just comes out completely mangled. I've tried -wt[+] <threshold>, but I think I don't understand how to use that option, because even -wt[+] 0 doesn't output pure white. I was hoping to do something like -wt[+] 60, since it's a rather dark gray.

I wanted to ask if you have some advice to get better results there, I'm using the GUI version provided on the website.
Old 06-27-2020, 09:17 PM   #1800
willus
@Gaaks--it would be most helpful if you could post some examples from the source PDFs that you discussed so that I could try some different options on them.
(Another way would be to send me samples via private message.)

Last edited by willus; 06-28-2020 at 08:31 AM.