Old 09-18-2012, 09:29 PM   #166
willus
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
Working with multicolumn scanned file

Quote:
Originally Posted by dianner View Post
Hi - I tried to convert a PDF file with this tool using the default options and it doesn't seem to work. This PDF file has 2 columns on some pages and 3 columns on others. I expected the pages to be 1 column, but they continued to be 2 and 3. I also tried specifying "co" and "1" on the interactive display, with the same result. I believe this PDF file is created from a hard copy document being scanned in. Could this be a problem? Or, is there some other option to specify? I've looked at the help for the options, but I can't see any that make sense. Thanks for any help.
Dianner--k2pdfopt should be able to break out the columns whether or not your document is scanned. Note that for 3 columns, you need to use "-col 4" on the command line (or "co" and "4" on the interactive display). It may be that something in your document is preventing k2pdfopt from recognizing the individual columns. If you can post or attach an example of your source PDF file (can be just a couple pages), that would really help. I'll take a look and see if I can figure out why it's not working.
Old 09-19-2012, 11:28 AM   #167
dianner
Junior Member
Posts: 4
Karma: 10
Join Date: Sep 2012
Device: Kindle Touch
Quote:
Originally Posted by willus View Post
Dianner--k2pdfopt should be able to break out the columns whether or not your document is scanned. Note that for 3 columns, you need to use "-col 4" on the command line (or "co" and "4" on the interactive display). It may be that something in your document is preventing k2pdfopt from recognizing the individual columns. If you can post or attach an example of your source PDF file (can be just a couple pages), that would really help. I'll take a look and see if I can figure out why it's not working.
I'm attaching one of the files as you requested. If there's something that the originator of the file can fix, let me know and I'll provide feedback. Thanks for taking a look at this. - Dianne
Attached Files
File Type: pdf web2012-09.pdf (1.18 MB, 543 views)
Old 09-19-2012, 11:27 PM   #168
willus
Example use of -evl option

Quote:
Originally Posted by dianner View Post
I'm attaching one of the files as you requested. If there's something that the originator of the file can fix, let me know and I'll provide feedback. Thanks for taking a look at this. - Dianne
Dianne--This is a tough document--lots of specialized formatting, varying column widths, and pretty densely packed text. I'm not sure you'll be satisfied with the result, but with the options below I thought k2pdfopt did an admirable job of formatting it for a smaller screen. Pages 10 and 11 got a bit butchered, but otherwise it separated the columns pretty well. I attached the kindle-optimized output and the one that shows how k2pdfopt flowed the pages. The options I used are:

k2pdfopt -evl 2 -col 4 -ch 0.65 -cg .08 -sm web2012-09.pdf

The -evl 2 option (interactive menu option "e") is the critical one--it erases vertical lines that divide the columns in your document, and this allows k2pdfopt to find the gaps between the columns. This is a new feature in k2pdfopt v1.50, and you gave me a perfect example to test it on. The -col 4 tells k2pdfopt to find up to 4 columns. The -ch 0.65 option (interactive menu under "co") sets the minimum height a region has to be in order to allow separation into different columns. The -cg .08 option (again under "co" in the interactive menu) sets the minimum gap between regions to 0.08 inches in order to consider them separate columns. And -sm shows how k2 flows the document in the web2012-09_marked.pdf file.
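For readers curious why the divider lines matter, here is a rough Python sketch of the general idea. It is not k2pdfopt's actual code: the function names, the 0/1 bitmap representation, and the thresholds are all assumptions made for illustration. It erases thin full-height rules, then looks for interior runs of blank pixel columns at least as wide as a minimum gap (the role -cg plays above).

```python
from typing import List, Tuple

Bitmap = List[List[int]]  # hypothetical page model: 1 = dark pixel, 0 = white pixel


def inches_to_pixels(inches: float, dpi: int) -> int:
    """Convert a gap given in inches (as with -cg) to whole pixels at an assumed dpi."""
    return round(inches * dpi)


def erase_vertical_lines(page: Bitmap, min_frac: float = 0.9, max_width: int = 2) -> Bitmap:
    """Blank out thin runs of pixel columns that are dark almost top to bottom.

    Assumes a non-empty, rectangular bitmap. Wide dark runs are left alone,
    since they are more likely figures than divider rules.
    """
    height, width = len(page), len(page[0])
    dark = [sum(row[x] for row in page) >= min_frac * height for x in range(width)]
    out = [row[:] for row in page]
    x = 0
    while x < width:
        if dark[x]:
            start = x
            while x < width and dark[x]:
                x += 1
            if x - start <= max_width:
                for row in out:
                    for xx in range(start, x):
                        row[xx] = 0
        else:
            x += 1
    return out


def find_column_gaps(page: Bitmap, min_gap: int) -> List[Tuple[int, int]]:
    """Return (start, end) x-ranges of interior all-white runs at least min_gap wide."""
    width = len(page[0])
    blank = [all(row[x] == 0 for row in page) for x in range(width)]
    gaps = []
    x = 0
    while x < width:
        if blank[x]:
            start = x
            while x < width and blank[x]:
                x += 1
            # Runs touching the left or right edge are margins, not column gaps.
            if start > 0 and x < width and x - start >= min_gap:
                gaps.append((start, x))
        else:
            x += 1
    return gaps
```

In this toy model, a rule running down the middle of a gap splits it into two narrow blank runs, neither wide enough to count as a column gap. Once the rule is erased, the two runs merge into one wide gap and the columns can be separated.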

To see how to set these options as the default (or set up a custom shortcut that will use them), see "customizing k2pdfopt" on my help pages.

In general, if you want k2pdfopt to be able to follow the flow of the document correctly, it helps to have the columns well separated from each other, uniform in width, and with clear gaps separating distinct regions of the document. Getting rid of the vertical line separators would remove the need for the -evl option. You can get an idea of what worked by looking at the "marked" file. Pages 10 and 11, in particular, were too much for my algorithms, as you can tell.
Attached Files
File Type: pdf web2012-09_k2opt.pdf (3.53 MB, 421 views)
File Type: pdf web2012-09_marked.pdf (5.11 MB, 458 views)

Last edited by willus; 09-19-2012 at 11:50 PM. Reason: Added -cg .08 to options.
Old 09-21-2012, 12:24 PM   #169
dianner
Wow - This tool did an amazing job on this document! I was skeptical since it is complex. I'm ok with the parts that didn't format very well. I usually work with those on my laptop. But, I wanted to read the rest of it on my Kindle and now I can. I guess I didn't look closely enough at the -evl option, which seems to be the key. Thanks for doing this for me. You have a great tool. I'm going to post this info on the message boards for this newsletter. Thanks again - Dianne
Old 09-21-2012, 09:13 PM   #170
willus
k2pdfopt v1.51 released

K2pdfopt v1.51 is released. This is a bug-fix / minor improvement release. See the web site for more details.
Old 09-22-2012, 01:13 PM   #171
dianner
Quote:
Originally Posted by dianner View Post
Wow - This tool did an amazing job on this document! I was skeptical since it is complex. I'm ok with the parts that didn't format very well. I usually work with those on my laptop. But, I wanted to read the rest of it on my Kindle and now I can. I guess I didn't look closely enough at the -evl option, which seems to be the key. Thanks for doing this for me. You have a great tool. I'm going to post this info on the message boards for this newsletter. Thanks again - Dianne
Well, now that I looked more closely I see the problem on page 10. It looks like the problem occurs when there's a table or chart that is in the middle of the page that splits 2 columns. It wasn't too bad in this document, since that only occurred on 1 page. But, I just downloaded the Oct. newsletter and there are 5 pages that have that format. So, I'm a little discouraged. I'm asking the originators of the newsletter to avoid this format.
Old 09-22-2012, 03:00 PM   #172
willus
Quote:
Originally Posted by dianner View Post
Well, now that I looked more closely I see the problem on page 10. It looks like the problem occurs when there's a table or chart that is in the middle of the page that splits 2 columns. It wasn't too bad in this document, since that only occurred on 1 page. But, I just downloaded the Oct. newsletter and there are 5 pages that have that format. So, I'm a little discouraged. I'm asking the originators of the newsletter to avoid this format.
Dianne--please check your private messages.

Last edited by willus; 09-22-2012 at 09:09 PM.
Old 09-26-2012, 02:32 AM   #173
saeed.geek
Junior Member
Posts: 3
Karma: 10
Join Date: Sep 2012
Device: Nokia 5233
Hi Willus. Thank you for your good app.
A question:
I have a PDF file. I adjusted the output width, but each input page is split into 2 or 3 output pages. I want each output page to be the same as the input page with 2x zoom. How do I do it?
And a suggestion:
Please add a night mode (black background, white text) to the options of k2pdfopt.
Old 09-26-2012, 09:05 AM   #174
willus
Quote:
Originally Posted by saeed.geek View Post
I want each output page to be the same as the input page with 2x zoom. How do I do it?
And a suggestion:
Please add a night mode (black background, white text) to the options of k2pdfopt.
I guess I'm not sure what you are asking. Do you want one output page for each input page, or two output pages for each input page? Either way, you should be able to adjust your device resolution settings: width (-w), height (-h), and output dpi (-odpi) to get what you want. You may want to turn wrapping off (-wrap-) and column detection off (-col 1) to prevent k2pdfopt from re-organizing your page--see my faq. I will add black on white processing to the wish list.
Old 09-26-2012, 05:06 PM   #175
erpel
Junior Member
Posts: 1
Karma: 10
Join Date: Sep 2012
Device: Kindle 3 wifi
I have to begin by thanking willus for this amazing piece of software. Have you considered adding a flattr.com button to the project website?

While converting papers to read on my Kindle, I ran across an odd case in which k2pdfopt recognizes a third region in a regular 2 col layout. This third region spans across the entire width along the bottom of the page.
The document I tried to convert is located here: www.cs.ucf.edu/~jjl/pubs/ICSE0289.pdf, page 8 is a good example.

I've played around a little with some of the options. Am I right in assuming that -crgh is a key factor in this?
I am wondering, what do the green markers in between paragraphs in the -sm output mean exactly?
Can you give some pointers on what to tweak to achieve a better result with the mentioned document?

Kind regards
Philipp

Last edited by erpel; 09-26-2012 at 05:23 PM.
Old 09-26-2012, 09:35 PM   #176
willus
Quote:
Originally Posted by erpel View Post
I ran across an odd case in which k2pdfopt recognizes a third region in a regular 2 col layout. This third region spans across the entire width along the bottom of the page.
The document I tried to convert is located here: www.cs.ucf.edu/~jjl/pubs/ICSE0289.pdf, page 8 is a good example.
Varying -crgh (specifically, making it larger) doesn't help in this case, but it should. I would call this a bug, actually. I'll look into it. For now, if you use -m 0.5 to chop off the page numbers, the columns are correctly separated all the way down.

Quote:
Originally Posted by erpel View Post
I am wondering, what do the green markers in between paragraphs in the -sm output mean exactly?
The green lines mark blank areas that are large enough that they are considered to separate contiguous regions. k2pdfopt will not try to wrap text across green lines, for example.
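As a rough illustration of that idea (a sketch only, not k2pdfopt's implementation), the Python below splits a page into contiguous regions wherever a run of blank rows is at least a chosen size. The function name, the per-row boolean input, and the threshold parameter are all hypothetical.

```python
from typing import List, Tuple


def split_into_regions(row_has_ink: List[bool], min_blank_rows: int) -> List[Tuple[int, int]]:
    """Return (start, end) row ranges, end exclusive, separated by large blank gaps.

    Blank runs shorter than min_blank_rows (ordinary line spacing, say) stay
    inside a region; longer runs end the region, like a green marker would.
    """
    regions = []
    start = None     # first inked row of the region being built
    last_ink = None  # most recent inked row seen so far
    for y, ink in enumerate(row_has_ink):
        if not ink:
            continue
        if start is None:
            start = y
        elif y - last_ink - 1 >= min_blank_rows:
            regions.append((start, last_ink + 1))
            start = y
        last_ink = y
    if start is not None:
        regions.append((start, last_ink + 1))
    return regions
```

Under this model, text would only be re-wrapped within a single returned region, never across the boundary between two of them.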

Thanks for the kind words. I had not heard of flattr.com--interesting concept. If people ask how to give me money I usually just tell them to give it to their favorite charity or some other open software project that needs it.
Old 09-27-2012, 03:16 AM   #177
saeed.geek
Quote:
Originally Posted by willus View Post
I guess I'm not sure what you are asking. Do you want one output page for each input page, or two output pages for each input page? Either way, you should be able to adjust your device resolution settings: width (-w), height (-h), and output dpi (-odpi) to get what you want. You may want to turn wrapping off (-wrap-) and column detection off (-col 1) to prevent k2pdfopt from re-organizing your page--see my faq. I will add black on white processing to the wish list.
Sorry for my bad English, and thank you for your reply. I found the way. And please don't forget the night mode.
Old 09-27-2012, 08:52 AM   #178
willus
Night mode

Quote:
Originally Posted by saeed.geek View Post
Sorry for my bad English, and thank you for your reply. I found the way. And please don't forget the night mode.
Did you want to invert the final output from k2pdfopt even if the source document is black on white, or did you want to process a source file that has white text on a dark background?
Old 09-28-2012, 04:39 AM   #179
saeed.geek
Quote:
Originally Posted by willus View Post
Did you want to invert the final output from k2pdfopt even if the source document is black on white, or did you want to process a source file that has white text on a dark background?
I want the output to be white text on a black background regardless of the input.
Smartphones have a backlight, and in the dark (like in bed) this backlight hurts the eyes. I use a smartphone, so I want a night mode (white text on a black background) for these situations.
Best regards.
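To make the request concrete, here is a minimal Python sketch of what "night mode regardless of input" could mean for a rendered grayscale page: every pixel value is flipped, so dark text on a light page becomes light text on a dark page. This is only an illustration. It is not a k2pdfopt feature at this point in the thread, and the function name and the list-of-rows image model are assumptions.

```python
from typing import List


def invert_grayscale(pixels: List[List[int]], max_value: int = 255) -> List[List[int]]:
    """Return a new image with each gray level v replaced by max_value - v."""
    return [[max_value - v for v in row] for row in pixels]
```

Note that a plain inversion like this also flips a source that is already white on black, so an "always night mode" option would first have to decide whether the input is dark on light.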
Old 09-29-2012, 12:57 PM   #180
willus
Quote:
Originally Posted by willus View Post
Quote:
Originally Posted by erpel View Post
While converting papers to read on my Kindle, I ran across an odd case in which k2pdfopt recognizes a third region in a regular 2 col layout.
Varying -crgh (specifically, making it larger) doesn't help in this case, but it should. I would call this a bug, actually. I'll look into it.
This was a bug in the way I was breaking up a region into vertical pieces (it had to do with a combination of trying to keep captions with figures and this particular region being the last one on the page). I've fixed it for the next release.