04-09-2015, 08:42 AM | #1036 | |
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Quote:
Also, your overlays look quite consistent--I can see a clear boundary around the region to ignore, and I can also see a pretty clear boundary between the top and bottom text regions. Are you sure you can't automate with something like this?

-cbox1o 0s,0s,.5s,.5s -cbox1o 0,.5s -cbox2e 0.5s,0s,.5s,.5s -cbox2e 0,.5s

The values might not quite be right, but they should be close. I'm also not sure if I got the even/odd pages correctly correlated with which side the "ignore box" is on. (If you leave off the width and height values from -cbox, it will default to extend to the edge of the page.)

If you want to convert your PDF to bitmaps, there are multiple programs that can do this. You could certainly write your own using the MuPDF library, or, like you said, it would be a trivial feature to add to k2pdfopt (it can already reassemble bitmaps into a PDF--just feed it a folder of .png or .jpg files named sequentially in the order you want them processed). The "convert" program from ImageMagick will also do this:

convert file.pdf file.png

... will create file-001.png, file-002.png, ... It is a powerful bitmap conversion command with many options.

Last edited by willus; 04-09-2015 at 08:45 AM. |
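As a small illustration of the "named sequentially" advice above, here is a sketch in Python that generates zero-padded file names. It assumes the folder is read in plain lexical name order, which the post does not state explicitly; the `sequential_names` helper and the `page` stem are hypothetical names used only for this example.

```python
def sequential_names(count, stem="page", ext=".png"):
    """Return zero-padded file names whose lexical order equals page order.

    Hypothetical helper: the stem and padding width are illustrative choices.
    """
    width = max(3, len(str(count)))
    return [f"{stem}-{i:0{width}d}{ext}" for i in range(1, count + 1)]


names = sequential_names(12)

# Zero padding keeps plain string sorting aligned with page order.
assert sorted(names) == names

# Unpadded names do not sort in page order:
# "page-10.png" lands before "page-2.png".
unpadded = [f"page-{i}.png" for i in range(1, 13)]
assert sorted(unpadded) != unpadded
```

Padding to a fixed width is the simplest way to make the name order and the page order agree regardless of how the consuming tool sorts.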
|
04-10-2015, 05:14 AM | #1037 |
Enthusiast
Posts: 29
Karma: 100000
Join Date: Oct 2013
Device: kindle
|
Kindle paper-white problem
Hello Willus,
I am facing a problem which has nothing to do with your excellent software, but with your knowledge of PDF files, you may be able to give some advice. I have a Kindle Paperwhite (2nd generation) for reading my books. I mainly read Greek literature (since I am Greek), which I find in PDF format. The problem is that most of those PDF files are not properly displayed on the Kindle; some characters are just not shown. Those are Greek characters (for example Greek “e-epsilon”, “k-kappa” or “s-sigma”, or vowels with accentuation). I managed to partially overcome this problem by saving the file with “Adobe Acrobat Pro X” as an optimized PDF with specific settings. Then everything is displayed properly on the Kindle, but the file is 5 to 10 times larger than the original and less clear. The funny thing is that on the Kindle Keyboard everything is perfect. I talked to Amazon customer service but they could not find a solution. Their last word was that, after all, the PDF reader is an experimental one. I am thinking that there may be another way or other software for making those “missing” characters display on the Kindle, without side effects... I am attaching a test page as a PDF file, a screenshot of how it is displayed on the Kindle, and the settings I use to optimize the file. Any advice would be appreciated. Thank you for your time. Angelos |
04-10-2015, 08:47 AM | #1038 | |
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Quote:
I'm guessing it's some kind of font rendering issue, but I can't be sure without seeing the original file. It looks like what Adobe may be doing is converting to bitmaps (at least--that's what is in your two attached PDFs--they have bitmapped pages, which should be displayed faithfully on any kindle). If it comes down to having to convert to bitmaps then yes, your resulting PDF file will be larger depending on the resolution, pixel depth, and compression settings of the bitmaps. You can tweak these things, particularly if you use k2pdfopt (-mode copy), but you'll likely have to live with a larger file size. Still--post the originals. Maybe there's a way to fix and/or embed the font? |
|
04-10-2015, 09:02 AM | #1039 |
Enthusiast
Posts: 29
Karma: 100000
Join Date: Oct 2013
Device: kindle
|
[QUOTE=willus;3080600]Hello Angelos. Welcome back. I'm confused. The "test" attachment--is that from your original PDF file (which doesn't display correctly on the paperwhite), or is that after processing with Adobe? Same with your screen shot--is that a screen shot of an Adobe-processed file? If it is, can you also post a sample from the original PDF file (that doesn't display correctly) and also a screen shot from it? I don't see in the screen shot you posted where it doesn't match the PDF files.
I'm guessing it's some kind of font rendering issue, but I can't be sure without seeing the original file. It looks like what Adobe may be doing is converting to bitmaps (at least--that's what is in your two attached PDFs--they have bitmapped pages, which should be displayed faithfully on any kindle). If it comes down to having to convert to bitmaps then yes, your resulting PDF file will be larger depending on the resolution, pixel depth, and compression settings of the bitmaps. You can tweak these things, particularly if you use k2pdfopt (-mode copy), but you'll likely have to live with a larger file size. Still--post the originals. Maybe there's a way to fix and/or embed the font?[/QUOTE]
Hello Willus, I am sorry, I attached the wrong file before. I am now attaching the correct original file (test1.pdf) and the file after optimization (test1 optimized.pdf). |
04-10-2015, 08:44 PM | #1040 |
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Angelos--the file you attached (test1.pdf) is still just a bitmapped PDF. Are you sure it does not display correctly on your Paperwhite? Can you please attach (1) a screen shot of a PDF not being displayed correctly on your paperwhite, and (2) the source PDF file which does not display correctly?
|
04-10-2015, 10:47 PM | #1041 |
Member
Posts: 17
Karma: 16138
Join Date: Mar 2015
Device: none
|
Hello, Willus,
Thanks for your help. Now I have some more questions:

(1) When re-flowing partially filled, center-aligned text with the extra justification option, I could not align it either right or left. It just stays center aligned. I assume the k2pdfopt program thinks it is a center-aligned title or heading line. Is that true? Is there any way to forcibly align it?

(2) When reflowing text, if some words at the end of the current line in the original text do not fit there, they are wrapped to the next line, partially filling that line and leaving some space. The next line of the original text then starts on a fresh 3rd line, not using the space left in the previous line. Is this the intended behavior of the program or did I miss some extra options? Let me illustrate it:

Original text:
-------------------------------------
first word second word some more words
second line some words some more ...
-------------------------------------
Re-flown result:
-------------------------------------
first word second word some more
words
second line some words some more ...
-------------------------------------
What I wanted:
-------------------------------------
first word second word some more
words second line some words some
more ...
-------------------------------------

(3) Is there any way to find out a bitmap PDF's original resolution? Let's assume I have a 1000 pixel wide bitmap picture and I converted it to a 10 inch wide PDF file, thus effectively having a 100 ppi resolution bitmap PDF. Later, for some reason, I need to convert this PDF to a bitmap. While converting, I set the resolution to 200 ppi (or dpi), giving a 2000 pixel wide bitmap. Is this "higher resolution" bitmap picture any better than the original one? I think it is a good question, since when we use the program we have the conflicting goals of a smaller file size and a higher resolution.
If we reflow some previously generated bitmap PDF of unknown resolution, we could set the resolution as high as the original resolution, but not higher. If we set it higher, I think it just increases the file size, but not the resolution. Am I right? Thanks. |
04-11-2015, 02:48 AM | #1042 | |
Enthusiast
Posts: 29
Karma: 100000
Join Date: Oct 2013
Device: kindle
|
Quote:
Willus, I attached a screenshot from the Kindle in my first post. For your convenience I am attaching it again. Check for example line #4, word #6: the word "βρίσκονται" is displayed as "βρίσ νται". This happens on every other line; you may not easily spot the differences because you don't read Greek. Regards, Angelos

P.S. I am also attaching the following files:
"test1.pdf": original file
"test1 optimized.pdf": file properly displayed on the Kindle after being saved as an optimized PDF in Adobe Acrobat Pro
"test1 underlined missing characters.pdf": original file with the characters not displayed on the Kindle underlined, for easier inspection

Last edited by agelos100; 04-11-2015 at 03:32 AM. |
|
04-11-2015, 08:27 AM | #1043 |
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Angelos--Thank you for your patience. I see the issue now. You are exactly right--I don't read Greek so it was not obvious to me. Since your PDF is simply storing JBIG images, I'm thinking the paperwhite PDF reader must have a bug in the JBIG decoder. Hopefully Amazon will fix it in a firmware rev. By "optimizing" you are changing the bitmap encoding and that's fixing your problem. I'll see if I can recommend any better settings.
|
04-11-2015, 10:32 AM | #1044 | |
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Quote:
PDF-1.6
Info object (21 0 R):
<</CreationDate(D:20131209141435+02'00')/Creator(abs)/ModDate(D:20150410155724+03'00')/Producer(K2pdfopt v2.12)/Title<>>>
Pages: 1

Retrieving info from pages 1-1...
Mediaboxes (1):
    1 (1 0 R): [ 0 0 223.4717 301.92452 ]

Images (1):
    1 (1 0 R): [ JBIG2 ] 1861x2842 1bpc ImageMask (15 0 R)

Form xobjects (1):
    1 (1 0 R): Form (13 0 R)

Last edited by willus; 04-11-2015 at 10:48 AM. |
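The listing above is enough to estimate the bitmap's effective resolution. The following Python sketch does the arithmetic, assuming the single image spans the whole MediaBox and that the page uses the default PDF unit of 72 points per inch; neither assumption is confirmed by the listing itself, and `effective_dpi` is a hypothetical helper name.

```python
POINTS_PER_INCH = 72.0  # default PDF user-space units per inch


def effective_dpi(pixels, points):
    """Pixels divided by the physical length (in inches) they are spread over."""
    return pixels / (points / POINTS_PER_INCH)


# Values copied from the listing above.
mediabox = (0.0, 0.0, 223.4717, 301.92452)  # [x0 y0 x1 y1] in points
image_px = (1861, 2842)                     # width x height in pixels

page_w_pt = mediabox[2] - mediabox[0]
page_h_pt = mediabox[3] - mediabox[1]

# Assumption: the image covers the full page in both directions.
dpi_x = effective_dpi(image_px[0], page_w_pt)
dpi_y = effective_dpi(image_px[1], page_h_pt)

print(f"horizontal: {dpi_x:.0f} dpi, vertical: {dpi_y:.0f} dpi")
```

Under that assumption the horizontal figure comes out near 600 dpi and the vertical one near 678 dpi. The mismatch suggests the image is not mapped uniformly onto the full page, so both numbers are best read as rough estimates.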
|
04-11-2015, 10:41 AM | #1045 | |
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Quote:
-j -1|0|1|2[+/-]
    Set output text justification. 0 = left, 1 = center, 2 = right. Add a + to attempt full justification or a - to explicitly turn it off. The default is -1, which tells k2pdfopt to try and maintain the justification of the document as it is. See also -wrap.

There is some fuzzy logic in k2pdfopt with the re-flow. If it thinks that one or more lines are significantly shorter than neighboring lines, then it won't re-flow to the next line because it assumes that these short lines are terminating a paragraph and/or are not meant to flow to the next line. This is probably what is happening in your second case. As always it helps me if you can post a source PDF file so that I can experiment with options that may help. Did you use -wrap+ (the GUI default)? The + will tell k2pdfopt to un-wrap short lines in cases where k2pdfopt determines that the neighboring lines are part of the same paragraph.

(3): The "pdfinfo" program from the MuPDF distro will tell you what is in a PDF file, including the bitmaps and what resolution they are. You can get a win64 version here. It runs from the command line. You've given me a good idea, though--I'll try to add a feature to the next rev of k2pdfopt (and the MS Windows GUI) which will report the information from the PDF file (like pdfinfo). You are correct that there is no point to using an output dpi significantly higher than the source dpi if the source PDF has only bitmapped pages.

Last edited by willus; 04-11-2015 at 10:46 AM. |
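The point about output dpi can be checked with the numbers from the question (a 1000 pixel wide bitmap on a 10 inch page). This Python sketch upsamples one row of pixels by simple replication, which is an assumption made for clarity: real renderers usually interpolate, but they cannot add detail either. The helper names are hypothetical.

```python
def upsample_row(row, factor):
    """Enlarge a row of pixels by repeating each pixel `factor` times."""
    return [p for p in row for _ in range(factor)]


def downsample_row(row, factor):
    """Shrink a row by keeping every `factor`-th pixel."""
    return row[::factor]


source_px, page_width_in = 1000, 10
source_ppi = source_px / page_width_in  # 100 ppi, as in the question

# Arbitrary synthetic gray values standing in for one row of the bitmap.
row = [(i * 37) % 256 for i in range(source_px)]

# Rendering at 200 ppi doubles the pixel count...
doubled = upsample_row(row, 2)
assert len(doubled) == 2000

# ...but the original row is recovered exactly, so nothing new was gained.
assert downsample_row(doubled, 2) == row
assert len(set(doubled)) == len(set(row))
```

The doubled row costs twice the storage before compression while carrying exactly the same information as the 100 ppi source.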
|
04-12-2015, 02:55 PM | #1046 |
Member
Posts: 15
Karma: 25800
Join Date: Dec 2014
Device: Nook Simple Touch
|
Willus, we are getting impatient about the new release
|
04-12-2015, 03:54 PM | #1047 |
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
|
04-13-2015, 04:35 AM | #1048 | |
Enthusiast
Posts: 29
Karma: 100000
Join Date: Oct 2013
Device: kindle
|
Quote:
Thank you for your time regarding the problem I have with the Kindle PW. I had a second chat with Amazon Kindle support; they are very friendly but they don't know or understand much about technical issues. They promised to forward the problem to their specialists and to let me know about it eventually. With regard to your suggestion to start a new thread: to be honest, I do not exactly understand the nature of the problem, I don't know the structure of PDF files, and it would feel wrong for me to start asking for details about a problem I cannot comprehend. If Amazon comes back with a follow-up email, I will let you know. Thanks again for your time. Angelos |
|
04-22-2015, 12:19 PM | #1049 |
Junior Member
Posts: 4
Karma: 13000
Join Date: Apr 2015
Device: Kindle 4 (...probably)
|
Hi Willus,
your software is pretty impressive! Thanks a lot. I couldn't find my way around one problem, though: when I use it on files like '1.pdf' (see attachment), with columns, images spanning more than one column, and little space between the text and the image, the software doesn't recognize the text around the image (it also happens when the remaining column keeps the normal column width of the text). Therefore, it treats the picture plus the text around it as one image (see page 3 in the attached document '2.pdf'). Besides leaving this text small, this messes up the whole order of the text, as you can see in the attached document '3 (1 marked).pdf'. Do you have any idea how I could get around this problem? Thank you again! Programs like yours, which solve small and sometimes hard but very annoying issues, make my life easier and happier. |
04-23-2015, 12:37 AM | #1050 | |
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Quote:
k2pdfopt -ch 1 -col 4 -cg .05 -cgr .5 -crgh .01 -comax .3 -evl 2 -sm 1.pdf

...but it still has things out of order. Of course, if you let me cheat, then I can put cropboxes around each part, but that's pretty labor intensive:

k2pdfopt -cbox .55in,.2in,4.23in,1.31in -cbox .55in,1.5in,2.41in,1.82in -cbox .55in,3.31in,1.71in,3.95in -cbox .55in,7.25in,2.38in,3.23in -cbox 2.92in,1.53in,2.42in,1.82in -cbox 2.25in,3.39in,3.69in,3.76in -cbox 2.92in,7.22in,2.39in,3.25in -cbox 5.37in,1.52in,2.33in,1.79in -cbox 5.94in,3.28in,1.75in,3.98in -cbox 5.34in,7.25in,2.36in,3.27in -fc- -sm 1.pdf

I'll try to think about ways I could get k2pdfopt to handle cases like this better. I may even have some options in there to help with this, but I'm not remembering... |
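Typing a long run of -cbox options by hand is error prone, so one option is to keep the rectangles in a list and generate the command line. This Python sketch assumes each box is given as left, top, width, height in inches, matching the order used in the commands above; `cbox_args` is a hypothetical helper, and only the first two boxes from the post are shown.

```python
import shlex


def cbox_args(boxes):
    """Turn (left, top, width, height) tuples in inches into -cbox options."""
    args = []
    for left, top, width, height in boxes:
        spec = ",".join(f"{v:g}in" for v in (left, top, width, height))
        args += ["-cbox", spec]
    return args


# First two crop boxes from the command above; extend the list as needed.
boxes = [
    (0.55, 0.2, 4.23, 1.31),
    (0.55, 1.5, 2.41, 1.82),
]

# The remaining flags are copied from the post, not chosen by this sketch.
command = ["k2pdfopt", *cbox_args(boxes), "-fc-", "-sm", "1.pdf"]
print(shlex.join(command))
```

The script only prints the command; it can then be pasted into a shell or passed to `subprocess.run(command)` on a machine where k2pdfopt is installed.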
|