#1786 | |
Fuzzball, the purple cat
Posts: 1,303
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Quote:
Also, I don't think you need to use the -m option since your PDF is very clean (not scanned). k2pdfopt will automatically trim away the margins in the fit-width mode (-fw). What is wrong with the output from this: Code:
k2pdfopt -mode fw -ls- twilight.pdf
Which converted pages don't you like?
|
#1787 |
Addict
Posts: 207
Karma: 304158
Join Date: Jan 2016
Location: France
Device: none
|
Sorry: It's because I started this journey with Calibre to turn PDF into EPUB. After seeing that my e-reader could actually read PDF rather well, except for complicated layouts such as tables, I figured I could just turn those pages into bitmaps and generate a hybrid PDF (text/bitmaps).
And then it dawned on me that k2pdfopt made all this useless since it can massage the whole file. As for the particular PDF I sent: It looks perfect on the computer (SumatraPDF), but on my 6 inch e-reader (768x1024, 212 DPI), some pages are a bit messy although, strangely enough, those contain only text, in one column. I used the command above: Code:
k2pdfopt -mode fw -ls- twilight.pdf
https://postimg.cc/gallery/YLbzKXP
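For reference, the screen figures quoted above (768x1024 pixels at 212 DPI) can be turned into physical dimensions with a little arithmetic. This is only a sanity-check sketch; `screen_inches` is a hypothetical helper name, not part of k2pdfopt.

```shell
#!/bin/sh
# screen_inches WIDTH_PX HEIGHT_PX DPI
# Prints width, height and diagonal of the screen in inches.
# Hypothetical helper for illustration; it does not call k2pdfopt.
screen_inches() {
    LC_ALL=C awk -v w="$1" -v h="$2" -v dpi="$3" 'BEGIN {
        printf "%.2f %.2f %.2f\n", w / dpi, h / dpi, sqrt(w * w + h * h) / dpi
    }'
}

# Values taken from the post: 768x1024 at 212 DPI
screen_inches 768 1024 212
# prints: 3.62 4.83 6.04
```

The diagonal comes out at about 6.04 inches, which agrees with the "6 inch" description of the device.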
#1788 | |
Fuzzball, the purple cat
Posts: 1,303
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Quote:
k2pdfopt -mode fw -ls- -n- twilight.pdf |
|
#1789 |
Addict
Posts: 207
Karma: 304158
Join Date: Jan 2016
Location: France
Device: none
|
Thanks for the tip.
As expected, the file size is much bigger (6x). FWIW, my e-reader offers a few settings when reading PDFs, one of them being a "Reflow text" option: The issue I had (some pages, with basic text, being a bit messy) occurs when I enable this option.
OTOH, with the "bitmapped file" (-mode fw -ls- -n-):
"Reflow text" enabled: The layout has even more issues than the native PDF (without -n-).
"Reflow text" disabled: Better, although I'll need to add a bit more margin because the text is too close to the edge.
I'm surprised bitmaps can be reflowed at all, since they're just pictures, unlike native PDFs. Out of curiosity, why does this layout issue occur with some pages in native PDF, with pages that only contain text? I would have expected this issue with more complicated layouts (tables, insets, etc.).
#1790 | |
Fuzzball, the purple cat
Posts: 1,303
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Quote:
You can increase the output margins with the -om command-line option, e.g. -om 0.2 will add 0.2 inches of padding around the output pages.
Last edited by willus; 05-07-2020 at 08:03 AM.
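Putting the options from this exchange together, the full command line might look like the sketch below. It only uses flags already quoted in the thread (-mode fw, -ls-, -n-, -om); `build_cmd` is a hypothetical helper that prints the command instead of running it, so you can check it first. Code:

```shell
#!/bin/sh
# build_cmd INPUT_PDF MARGIN_INCHES
# Hypothetical helper: prints a k2pdfopt command line assembled from the
# options discussed in this thread. It does not run k2pdfopt itself.
#   -mode fw  fit-width mode
#   -ls-      no landscape output
#   -n-       bitmapped (non-native) output
#   -om       output margin in inches
build_cmd() {
    printf 'k2pdfopt -mode fw -ls- -n- -om %s %s\n' "$2" "$1"
}

# Example with the file and margin mentioned above
build_cmd twilight.pdf 0.2
# prints: k2pdfopt -mode fw -ls- -n- -om 0.2 twilight.pdf
```

Copy the printed line into a terminal to run the conversion. The helper does no quoting, so a file name containing spaces would need quotes added by hand.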
|
#1791 |
Addict
Posts: 207
Karma: 304158
Join Date: Jan 2016
Location: France
Device: none
|
Thanks much!
-- Edit: I went through the list of commands, but didn't find how to do this. Since only some pages are wrongly displayed on my e-reader, is it possible to have k2 turn only those into bitmaps, while leaving the rest as text (native PDF)? And out of curiosity, would you have an idea why my e-reader goes crazy with those, although they only contain basic text (so shouldn't be a problem)?
Last edited by Shohreh; 05-07-2020 at 03:32 PM.
#1792 | |
Fuzzball, the purple cat
Posts: 1,303
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Quote:
|
|
#1793 |
Addict
Posts: 207
Karma: 304158
Join Date: Jan 2016
Location: France
Device: none
|
Yes, disabling "Reflow text" and adding a .2 inch margin looks pretty good. Thank you.
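To get a feel for what a 0.2 inch margin means on this screen, here is a rough conversion to pixels. It assumes the page is shown at the reader's native 212 DPI, which may not hold if the reader rescales the page; `inches_to_px` is a hypothetical helper name. Code:

```shell
#!/bin/sh
# inches_to_px INCHES DPI
# Hypothetical helper: converts a length in inches to pixels at the given DPI.
# Assumption: the page is displayed at the screen's native resolution.
inches_to_px() {
    LC_ALL=C awk -v inches="$1" -v dpi="$2" 'BEGIN { printf "%.1f\n", inches * dpi }'
}

# 0.2 inch margin on a 212 DPI screen
inches_to_px 0.2 212
# prints: 42.4
```

Under that assumption the margin takes roughly 42 pixels on each side of the 768-pixel-wide screen.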
|
#1794 |
Fuzzball, the purple cat
Posts: 1,303
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
k2pdfopt v2.52 released
k2pdfopt v2.52 is released.
This is primarily a bug-fix release, fixing over 20 issues that have accumulated over time. There are also a few enhancements, including the ability to directly download Tesseract OCR language data files (finally). See details at the web site.
#1795 |
Enthusiast
Posts: 27
Karma: 122330
Join Date: Sep 2017
Device: ipad , Kindle PW3
|
Great work
I tested the OCR and it is working perfectly. The program downloads the Arabic language data from the web. The OCR text output file is OK, but the text inside the PDF is not in Arabic (maybe it is an encoding issue).
Regards
Last edited by msh2050; 06-13-2020 at 05:23 PM.
#1796 |
Fuzzball, the purple cat
Posts: 1,303
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
I'm not sure what you mean by "the text inside the PDF is not in Arabic." The font used for the OCR layer is not an Arabic font--that is true, but if you copy and paste the text from the PDF file to another application like MS Word or Google, you should see the correct Arabic characters. And searching the PDF should also work correctly.
|
#1797 |
Guru
Posts: 937
Karma: 53902736
Join Date: Jun 2015
Device: multiple
|
Hi,
Aside from -cmax and -wt+, are there other tricks to increase contrast between text and shaded partial backgrounds? And is it possible to output black and white instead of grayscale or color? (I'm currently using a low value for -wt+ and it looks like black and white, but I'm not sure if it's just darker shades of gray and white.)
Last edited by MarjaE; 06-17-2020 at 08:26 AM.
#1798 | |
Fuzzball, the purple cat
Posts: 1,303
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Quote:
|
|
#1799 |
Member
Posts: 10
Karma: 49658
Join Date: Jun 2020
Device: KT4
|
Hello Willus, first I'd like to thank you for making this software. It's simply amazing: pretty much all of my material is in PDF format, and k2pdfopt has made some things that were unreadable on the Kindle amazing to read.
But I am having problems with photocopied/scanned material, the kind where there are two pages side by side. I tried using the 2-column feature to split the pages, but I guess that because the scans are low quality the detection isn't working well: some pages get split in the wrong places and others just don't get split at all.
What I've been doing instead is using the crop areas feature and selecting each page. It works well: the output is in the correct sequence and each crop is treated as its own page, instead of the entire image being processed. The problem is simply that I can't set a crop for each file, so because the scans are off-center and differ in size, the crop never lines up properly and I have to do them one by one.
I've also been running into the issue of text getting split in badly centered PDFs. I've been putting -f2p -1 into the additional options like you recommend in the FAQ, but I couldn't find the "bp" options in the GUI.
Another, less common problem is that some high-quality color scans have the beige color of the page, and the output ends up being grayscale with a gray background, which is annoying to read. I wanted to know if there's a way to make the output monochrome. -bpc 1 sometimes works, but the text quality is really poor, and sometimes the output just comes out completely mangled. I've tried -wt[+] <threshold>, but I think I don't understand how to use that option, because even -wt[+] 0 doesn't output pure white. I was hoping to do something like -wt[+] 60 since it's a rather dark gray.
I wanted to ask if you have some advice to get better results there. I'm using the GUI version provided on the website.
#1800 |
Fuzzball, the purple cat
Posts: 1,303
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
@Gaaks--it would be most helpful if you could post some examples from the source PDFs that you discussed so that I could try some different options on them.
(Another way would be to send me samples via private message.)
Last edited by willus; 06-28-2020 at 08:31 AM.
Tags |
ebook apps, k5 tools, kindle tools, kindle touch, tools |