02-12-2015, 07:46 PM | #976 |
Guru
Posts: 829
Karma: 2525050
Join Date: Jun 2010
Device: K3W, PW4
|
For several years I have been using the Briss tool to manually massage a magazine available to me in PDF format for easy reading on my Kindle 3 WiFi. Most of the pages are in two-column format, but a few are single-column, and some have tables or images spanning both columns. Footnotes are also present in some of the articles, and I usually break those out on their own. The source of the PDF adds headers and footers, which I strip out. The PDF file consists of page images, as opposed to anything more useful.
Is it possible for k2pdfopt to do this for me? If you would like to see a specific example, I will be happy to provide an example magazine, before and after. In case it makes a difference, I am using Linux Mint 17.1 Rebecca x64 here and have downloaded and installed the latest Linux release of k2pdfopt.
Dave
Last edited by dhdurgee; 02-12-2015 at 07:47 PM. Reason: add detail |
02-12-2015, 10:12 PM | #977 |
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Yes, it's possible k2pdfopt might automate this process for you--it depends largely on the specifics of the magazine layout, how clean and consistent the pages are, etc., as to how satisfactory your results will be. You can post or PM me an example--that would be the most helpful.
|
02-13-2015, 09:27 AM | #978 | |
Guru
Posts: 829
Karma: 2525050
Join Date: Jun 2010
Device: K3W, PW4
|
Quote:
Dave |
|
02-13-2015, 10:46 AM | #979 | |
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Quote:
k2pdfopt -m .11in,.04in,.14in,.6in source.pdf
Or, in the GUI, which you can run in Wine, set the crop margins (see attached). |
|
02-13-2015, 12:19 PM | #980 | |
Guru
Posts: 829
Karma: 2525050
Join Date: Jun 2010
Device: K3W, PW4
|
Quote:
Regarding further cropping, inspecting the original shows that the footer is consistent, except on the blank pages I inserted to keep the even/odd pages correct, whereas the header area that I crop out manually appears only at the beginning of articles. Thus each PDF page is either the first page of an article, a subsequent page of an article, or an inserted blank. Is the tool up to such a detailed classification of pages? If so, perhaps this can be cleaned up further. I also notice that the table of contents got a bit mangled, but other than that a casual check seems to show a good job done on the articles themselves. Thank you for your assistance with this.
Dave |
|
02-13-2015, 01:37 PM | #981 | |
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Quote:
At this time you can use crop boxes (-cbox) to crop individual sets of pages differently, e.g.
-cbox5,10,20-29 .11in,.04in,5.35in,8.99in
would crop pages 5, 10, and 20-29 starting at .11 inches from the left and .04 inches from the top, to a width x height of 5.35 in x 8.99 in. But for me it wouldn't be worth going to that kind of trouble just for casual reading. If you want to get really fancy, you can use the -p option to process only certain source pages (in different ways) and then re-assemble all of the converted parts with something like PDFtk or jpdftweak. But again, for me, it wouldn't be worth it for casual reading. |
|
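As a rough aid for working out -cbox figures, the sketch below converts four margin values into a left,top,width,height crop box for a page of known size. It assumes the -m values are ordered left, top, right, bottom, which is an assumption worth checking against the k2pdfopt usage text. margins_to_cbox is a hypothetical helper name, not part of k2pdfopt.

```shell
#!/bin/sh
# margins_to_cbox PAGE_W PAGE_H LEFT TOP RIGHT BOTTOM   (all values in inches)
# Prints "left,top,width,height", the form used by the -cbox example above.
# Assumption: the margins are given in the order left, top, right, bottom.
margins_to_cbox() {
    awk -v w="$1" -v h="$2" -v l="$3" -v t="$4" -v r="$5" -v b="$6" 'BEGIN {
        # width and height are what remains after removing both margins
        printf "%.2fin,%.2fin,%.2fin,%.2fin\n", l, t, w - l - r, h - t - b
    }'
}

# A 5.6 x 9.0 in page with the margins from the -m example in this thread
margins_to_cbox 5.6 9.0 0.11 0.04 0.14 0.6   # prints 0.11in,0.04in,5.35in,8.36in
```

For a 5.6 x 9.0 in page this gives the same 5.35in width as the -cbox example above. The height differs from that example's 8.99in, which appears to keep almost the full page height rather than trimming the footer.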
02-13-2015, 03:41 PM | #982 |
Guru
Posts: 829
Karma: 2525050
Join Date: Jun 2010
Device: K3W, PW4
|
Does the -cbox option work in addition to the -m option? I am already using PDFtk to assemble a single PDF from the individual articles, so I would not have too much trouble determining which pages have the extra header information that needs to be removed. Looking at the output, I assume I should be somewhat conservative on this, as you appear to be detecting white space margins for automatic cropping.
Dave |
02-13-2015, 05:19 PM | #983 | |
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Quote:
|
|
02-13-2015, 07:50 PM | #984 |
Guru
Posts: 829
Karma: 2525050
Join Date: Jun 2010
Device: K3W, PW4
|
I just took a look at the full command-line documentation, and I am now wondering whether it might make sense to run your tool first on the individual article PDF files and then use PDFtk to merge the processed files into a single PDF file. With that approach, the top portion I would need to crop would be on the first page of each article. So I assume I would be able to use -cbox1 (specific figures) -m (specific figures) ./*.pdf as arguments to crop page one only and then apply the margins to the appropriate sections.
Dave |
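One cautious way to try the per-article approach is to print the planned commands first and inspect them before running anything. The sketch below is only a dry run, under several assumptions: converted files are named `<name>_k2opt.pdf`, K2_OPTS stands in for whatever options are finally chosen (shown here with just the -m figures quoted earlier in the thread), and plan_batch and the file names are hypothetical. Whether k2pdfopt runs each file without prompting is not covered here.

```shell
#!/bin/sh
# Dry run: print the commands instead of executing them.
# K2_OPTS is a placeholder for the k2pdfopt options you settle on.
K2_OPTS="-m .11in,.04in,.14in,.6in"

# plan_batch MERGED.pdf ARTICLE.pdf...
plan_batch() {
    merged=$1
    shift
    outputs=""
    for src in "$@"; do
        echo "k2pdfopt $K2_OPTS $src"
        # Assumed output naming: <name>_k2opt.pdf next to the source file
        outputs="$outputs ${src%.pdf}_k2opt.pdf"
    done
    # PDFtk concatenates its inputs in the order given
    echo "pdftk$outputs cat output $merged"
}

plan_batch magazine_k2opt.pdf article1.pdf article2.pdf
```

This simple version does not handle file names containing spaces, so it is best treated as a starting point rather than a finished script.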
02-13-2015, 08:00 PM | #985 | |
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Quote:
|
|
02-13-2015, 09:49 PM | #986 | |
Guru
Posts: 829
Karma: 2525050
Join Date: Jun 2010
Device: K3W, PW4
|
Quote:
I also tried adding -bpc 1, which gets the size down to a much more comparable figure. Given that these are B&W or at most greyscale image scans, can I expect any particular problems with this approach? On occasion there are photo images or artwork in the magazine. Is there any special provision to treat those differently, perhaps doing them at a higher bpc while keeping the bpc at 1 for text areas? Thanks again for your input on this.
Dave |
|
02-14-2015, 01:29 AM | #987 | |
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Quote:
|
|
02-14-2015, 04:30 PM | #988 |
Guru
Posts: 829
Karma: 2525050
Join Date: Jun 2010
Device: K3W, PW4
|
I just gave the following a try, but the results were unexpected:
k2pdfopt -bpc 1 -cbox1 0in,.52in -m .11in,.04in,.14in,.6in Analog_2014-12-01.pdf
Reading 4 pages from Analog_2014-12-01.pdf ...
Detecting document orientation ... No rotation necessary.
SOURCE PAGE 1 of 4 (5.7 x 9.6 in) ... 3 new pages saved.
SOURCE PAGE 2 of 4 (5.6 x 9.0 in) ... 0 new pages saved.
SOURCE PAGE 3 of 4 (5.7 x 9.0 in) ... 0 new pages saved.
SOURCE PAGE 4 of 4 (5.6 x 9.0 in) ... 0 new pages saved.
4 pages written to Analog_2014-12-01_k2opt.pdf (0.1 MB).
Note that only the first page was processed. I expected all four pages to be processed, with only the first page having the top .52in cropped off before further processing. What did I miss?
Dave |
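A problem like this can be spotted without reading the whole log by listing the source pages that report "0 new pages saved". The sketch below assumes the log wording shown in this post. empty_pages is a hypothetical helper that reads a saved log on standard input, and it scans word by word so it works whether or not the log lines are wrapped.

```shell
#!/bin/sh
# empty_pages: read a k2pdfopt log on stdin and print the number of every
# source page that reported "0 new pages saved".
# Assumption: the log uses the "SOURCE PAGE n of m ... k new pages saved." wording.
empty_pages() {
    awk '{
        for (i = 1; i <= NF; i++) {
            # remember the most recent source page number seen
            if ($i == "SOURCE" && $(i + 1) == "PAGE") page = $(i + 2)
            # report it when the count before "new pages" is zero
            if ($i == "new" && $(i + 1) == "pages" && $(i - 1) == "0") print page
        }
    }'
}

empty_pages <<'EOF'
SOURCE PAGE 1 of 4 (5.7 x 9.6 in) ... 3 new pages saved.
SOURCE PAGE 2 of 4 (5.6 x 9.0 in) ... 0 new pages saved.
SOURCE PAGE 3 of 4 (5.7 x 9.0 in) ... 0 new pages saved.
SOURCE PAGE 4 of 4 (5.6 x 9.0 in) ... 0 new pages saved.
EOF
```

With the log from this post, it prints 2, 3 and 4 on separate lines, matching the pages that produced no output.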
02-14-2015, 08:18 PM | #989 | |
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Quote:
-cbox2- 0,0 |
|
02-15-2015, 07:53 AM | #990 | |
Guru
Posts: 829
Karma: 2525050
Join Date: Jun 2010
Device: K3W, PW4
|
Quote:
Any idea why this is happening? I guess I can work around it with a two-pass process, but it is strange. Is this a bug that needs fixing?
Dave
Last edited by dhdurgee; 02-15-2015 at 07:54 AM. Reason: fix typo |
|