01-09-2009, 01:02 AM | #46 |
GuteBook/Mobi2IMP Creator
Posts: 2,958
Karma: 2530691
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 Device: T1 NSTG iLiad_v2 NC Device: Asus_TF Next1 WPDN
|
|
01-10-2009, 03:01 AM | #47 |
Wizard
Posts: 3,671
Karma: 12205348
Join Date: Mar 2008
Device: Galaxy S, Nook w/CM7
|
|
01-17-2009, 11:07 PM | #48 | ||
Member
Posts: 19
Karma: 72
Join Date: Dec 2008
Device: bebook
|
First of all ... Thanks for this wonderful application. I've used it many times now. And it works great. I use it for most of the technical papers I need to read for work.
As Ulysses posted before, it does crash sometimes. For me, that happens mostly with very big documents. (I regularly use PaperCrop on documents with 500 or more pages.) PaperCrop seems to keep everything in memory before creating the PDF output, so it basically runs out of memory before it can finish. I haven't had the time to look at the source code yet, but it could be an idea to store the page images on disk while processing and read them back while producing the PDF.
Since the conversion of such big PDF books results in PDF books with 1000 pages or more, this also becomes a problem for me and my poor BeBook with its limited memory. To be able to split these big books into multiple smaller ones, I altered the 'config.lua' file (which, if everything went fine, should be attached) to make PaperCrop output a new PDF after every 100 pages. You can change this number at the top of the file, in the line that reads
Code:
nr_of_pages_per_pdf_book = 100;
Maybe it can help you too, Ulysses?
Last edited by wiffel; 01-17-2009 at 11:09 PM. |
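The splitting idea from this post can be sketched in plain Lua. This is only an illustration, not the attached config.lua: the names new_splitter, add_page and flush are made up, and write_part is a placeholder for whatever actually writes a PDF part.

```lua
-- Hypothetical sketch: group pages into parts of at most `limit` pages.
-- `write_part(part_no, pages)` is a placeholder for the real PDF writer.
local function new_splitter(limit, write_part)
  local s = { pages = {}, part_no = 1 }

  -- Write out whatever is buffered and start the next part.
  function s:flush()
    if #self.pages > 0 then
      write_part(self.part_no, self.pages)
      self.part_no = self.part_no + 1
      self.pages = {}
    end
  end

  -- Buffer one page; flush as soon as the part is full.
  function s:add_page(page)
    self.pages[#self.pages + 1] = page
    if #self.pages >= limit then
      self:flush()
    end
  end

  return s
end

-- Usage: 250 dummy pages with a limit of 100 give parts of 100, 100 and 50.
local sizes = {}
local splitter = new_splitter(100, function(part_no, pages)
  sizes[part_no] = #pages
end)
for i = 1, 250 do
  splitter:add_page("page " .. i)
end
splitter:flush()  -- do not forget the last, partly filled part
print(table.concat(sizes, ", "))  --> 100, 100, 50
```

The final flush matters: without it, the pages left over after the last full part would never be written.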
||
01-18-2009, 04:47 PM | #49 |
Member
Posts: 19
Karma: 72
Join Date: Dec 2008
Device: bebook
|
If anybody is interested ...
I did implement what I proposed in my previous post (save the images and load them later to create the PDF file(s)). This makes the conversion of large files use a lot less memory. The config.lua to do this is attached.
PS for Taesoo Kwon: While implementing this, I added Lua's collectgarbage call in a couple of places to make sure that images are garbage collected. I noticed that I had a lot of crashes while generating the PDF. This makes me think that the PDF structure (used by outpdf:addPage(image)) does not keep a reference to the images that have been added. That could account for the 'random' crashes. Just to be sure, I added a little list that keeps a reference to the images until the PDF has been created. That way they cannot be garbage collected by Lua. That seemed to fix the crashes for me. |
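The "keep a reference in a list" workaround relies on a general Lua mechanism, which the sketch below illustrates. It does not show PaperCrop's or libharu's real behaviour: a weak-valued table is used here only as a stand-in for a native structure that Lua's collector does not see, and all names (native_side, keep_alive, add_page) are hypothetical.

```lua
-- A weak-valued table stands in for a native structure holding raw
-- pointers: entries in it do not count as references for Lua's GC.
local native_side = setmetatable({}, { __mode = "v" })
local keep_alive = {}  -- ordinary (strong) list, as in the workaround

-- Hand an image to the "native" side; optionally pin it in keep_alive.
local function add_page(image, pin)
  native_side[#native_side + 1] = image
  if pin then
    keep_alive[#keep_alive + 1] = image
  end
end

add_page({ name = "unpinned" }, false)
add_page({ name = "pinned" }, true)

collectgarbage()  -- run a full collection cycle

print(native_side[1])       -- nil: nothing kept the first image alive
print(native_side[2].name)  -- pinned: still referenced from keep_alive
```

Once the PDF has been written, dropping the list (for example keep_alive = {}) would let the images be collected again.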
01-21-2009, 11:44 AM | #50 |
Wizard
Posts: 3,671
Karma: 12205348
Join Date: Mar 2008
Device: Galaxy S, Nook w/CM7
|
Great idea, thank you. My BB Storm also has a heck of a time loading large PDFs.
I tried to run your script and got this error. "lua error config.lua 86: attempt to compare nil with number" =X= |
01-22-2009, 07:40 AM | #51 |
Member
Posts: 19
Karma: 72
Join Date: Dec 2008
Device: bebook
|
Hi =X=,
Did you cut-copy-paste part of the code into your config.lua? My latest version does not have any code on line 86 (except for the END statement). So, it's hard for me to see what is going wrong. Wiffel |
01-22-2009, 01:25 PM | #52 | |
Wizard
Posts: 3,671
Karma: 12205348
Join Date: Mar 2008
Device: Galaxy S, Nook w/CM7
|
I've re-run the tool with your unmodified config.lua. The problem code is line 81 (marked in red). Msg:
Code:
"lua error config.lua:81:attempting to compare nil with a number"
Here is the code where the error occurs:
Code:
function outputImage(image, outdir, pageNo, rectNo)
    if output_to_pdf then ----if output_to_pdf and outpdf:isValid() then
        --vv--outpdf:addPage(image)
        if (book_pages.nr_of_pages < nr_of_pages_per_pdf_book) then
            book_pages:add_page(image, outdir);
        else
            book_pages:writeToFile(outdir);
            book_pages:init_for_next_part();
        end
        --^^--
    else
        image:Save(string.format("%s/%05d_%03d%s",outdir,pageNo,rectNo,output_format))
    end
end
|
|
01-22-2009, 06:12 PM | #53 |
Member
Posts: 19
Karma: 72
Join Date: Dec 2008
Device: bebook
|
=X=,
Are you sure that you still have a line like nr_of_pages_per_pdf_book = 100; somewhere? If that is the case, it can't hurt to put a line like book_pages:init(1); just before the line function outputImage(image, outdir, pageNo, rectNo). Anyway, it's strange. The file works fine for me. Good luck, Wilfried |
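Lua raises "attempt to compare nil with number" when one side of a < comparison is nil, which is why both the page limit and the page counter are suspects here. The sketch below reproduces the error with stand-in values and shows one possible way to get a clearer message. The helper check_number is hypothetical, and setting nr_of_pages to 0 is only a guess at what an init call would do.

```lua
-- Hypothetical stand-ins for the two values compared in outputImage.
local book_pages = {}                 -- nr_of_pages was never initialised
local nr_of_pages_per_pdf_book = 100

-- pcall catches the error so that the message can be inspected.
local ok, err = pcall(function()
  return book_pages.nr_of_pages < nr_of_pages_per_pdf_book
end)
print(ok, err)  -- false, plus an "attempt to compare nil with number" message

-- Defensive variant: fail early with a message that names the culprit.
local function check_number(name, value)
  if type(value) ~= "number" then
    error(name .. " is " .. type(value) .. ", expected a number", 2)
  end
  return value
end

book_pages.nr_of_pages = 0            -- assumed effect of an init call
local current = check_number("book_pages.nr_of_pages", book_pages.nr_of_pages)
local limit = check_number("nr_of_pages_per_pdf_book", nr_of_pages_per_pdf_book)
print(current < limit)  --> true
```

A check like this at the top of outputImage would say which of the two values is missing instead of only reporting the line number.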
01-23-2009, 10:50 AM | #54 |
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
Hi,
I converted a two-column PDF book of 144 pages (really a 288-page book) using PaperCrop. It went up from one meg to 18 megs. On Linux, I used ImageMagick to batch-reduce the size of the images by 50 percent, which gave a file of about 6 or 7 megs, but divided into 144 numbered fragments. I declared it to be a png file and it processed the whole lot:
Code:
mogrify -resize 50% *.jpg
After that, I grouped all this into a zip file and converted it to lrf with comic2lrf. This is a lot of work, on two platforms, but the result is readable and not too heavy. |
01-23-2009, 12:00 PM | #55 | |
Enthusiast
Posts: 27
Karma: 163
Join Date: Nov 2008
Device: Kobo wifi
|
Thank you. By the way, if you want, I can make you (or anybody who wants) a member of the Google Code project page, so that you can update the source code and binaries directly, or upload your own versions of the PaperCrop binary there so that people can choose which version to use. (Whenever I update a version, it seems that a new bug is always introduced...)
P.S. As far as I understand the libharu library and my PDFWriter class, outpdf:addPage doesn't keep a pointer to the image, and you can discard the image as soon as you call the addPage function. But I cannot figure out why that kind of problem happened. |
|
01-23-2009, 02:34 PM | #56 |
Wizard
Posts: 3,671
Karma: 12205348
Join Date: Mar 2008
Device: Galaxy S, Nook w/CM7
|
|
01-23-2009, 05:16 PM | #57 |
Addict
Posts: 242
Karma: 177
Join Date: Nov 2007
Location: Amsterdam
Device: sony 505
|
Hi
Nice little program. I'm having a problem with the bottom two lines repeating on the following page. Any ideas what causes this? And how do I change the font size? |
01-23-2009, 06:01 PM | #58 |
Addict
Posts: 242
Karma: 177
Join Date: Nov 2007
Location: Amsterdam
Device: sony 505
|
OK, I changed the scroll overlap to 0 (scroll_overlap_pixels=0) and that got rid of most of the problem. The only thing is that the top quarter of the letters on the following page ends up at the bottom of the previous one. This progresses through the document, so that after a few pages the letters are cut in half. Is there a way to fix this, like giving the scroll overlap a negative value?
Last edited by mazzeltjes; 01-23-2009 at 06:44 PM. |
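One possible reading of this symptom, offered purely as an assumption and not taken from PaperCrop's source: if a tall image is cut at a fixed step of device_height minus scroll_overlap_pixels, the cut drifts relative to the text lines whenever that step is not a multiple of the line spacing. The sketch uses made-up numbers (800 px screen, one text line every 30 px) and a hypothetical helper slice_starts.

```lua
-- Assumption: slices start every (device_height - overlap) pixels
-- down a tall image. This is a guess at fixed-height slicing in general.
local function slice_starts(total_height, device_height, overlap)
  local step = device_height - overlap
  assert(step > 0, "overlap must be smaller than device_height")
  local starts = {}
  local y = 0
  while y < total_height do
    starts[#starts + 1] = y
    y = y + step
  end
  return starts
end

-- Made-up numbers: 3200 px of content, 800 px screen, lines every 30 px.
local line_pitch = 30
local offsets = {}
for i, y in ipairs(slice_starts(3200, 800, 0)) do
  offsets[i] = y % line_pitch  -- how far into a text line the cut falls
end
print(table.concat(offsets, " "))  --> 0 20 10 0
```

Under this assumption the cut lands at a different height inside a text line on each page, which would look like letters being sliced progressively; whether PaperCrop really slices this way is not confirmed in this thread.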
01-27-2009, 03:10 PM | #59 | |
useR!
Posts: 299
Karma: 651
Join Date: Nov 2007
Location: NY
Device: Onyx Boox Max 2, Kobo Libra H2O, iRiver Story HD
|
1. Convert each page into one long image.
1a. First, edit 'config.lua' to increase the 'device_height' option. You need to do this only once. For example, I use the following.
Code:
device_width=700
device_height=30000
scroll_overlap_pixels=0
1c. Press "Process current page" on each page.
2. Archive all images into one ZIP file.
3. Process with PDFLRF using Portrait or Comic-Portrait with the Smart-cut option on. I use a simple batch file such as
Code:
pdflrf --erode=2 --nocrop -rs -c 8 --rotation="0" --pad=10 --overlap=0 -i %1 -o "%~n1.lrf" -t %2 -a %3
If this batch file is named 'plb.bat', then you can use the following command to convert the zip file from step 2. Use double quotes, since the Windows command prompt does not treat single quotes as grouping an argument.
Code:
plb "filename.zip" "Title of the book" "Author of the book"
It is too bad that PDFLRF is not open source. If it were, its smart-crop algorithm could have been incorporated into programs like PaperCrop rather easily.
Last edited by soilwork; 01-29-2009 at 02:45 PM. Reason: Modified the batch file with less arguments |
|
01-28-2009, 10:14 AM | #60 |
Addict
Posts: 242
Karma: 177
Join Date: Nov 2007
Location: Amsterdam
Device: sony 505
|
Thanks, Soilwork.
I'll give that a shot. It looks like a lot of work, but it might be worth it. |
|