Old 04-04-2007, 01:28 PM   #31
ashkulz
Addict
Posts: 350
Karma: 705
Join Date: Dec 2006
Location: Mumbai, India
Device: Kindle 1/REB 1200
Quote:
Well, it is not really a cropping problem in itself. As one can see here, the png's look just fine. However, when I open the imp file (either on EB-1150 or on the computer) the scroll bar of the EB-1150 at the bottom of the screen overlaps a little bit with the image and the first letters of the words in the text cannot be seen.
Hmm, can you run it manually and try specifying a lower hres? That is, run the command:
Code:
<install-dir>\pdfread-run.cmd -p eb1150 --hres 454 <pdf-file>
Can you please experiment and let me know the optimal resolution at which it doesn't get cut off? If you do, I will update the profile in the next release.

Quote:
Also, in some pages, the last line of text in the page is being "cut" by the crop, and becomes completely illegible, as one can see here . One possible solution that I can see is to crop the pages so that the bottom of one page would overlap with the top of the next page. In case the text is being cropped and the last line is illegible, this solution would allow one to read the line in question on the next page. Of course, this would probably generate some redundancies, but it's preferable to have the same line of text twice than not to have it at all.
Well, that is already implemented! If you look closely, over 3 lines of text are overlapping between pages. The line that is cropped on page 1 already appears as the 3rd line on page 2. The amount of overlap is controlled by the "overlap" parameter, which is set to 45 pixels in the eb1150 profile.
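
As a rough illustration of how an overlap like this works, here is a minimal Python sketch. It is not taken from pdfread.py; the function name `slice_offsets` and the page and screen heights are assumptions made up for the example. Only the 45-pixel overlap comes from the post above.

```python
def slice_offsets(page_height, screen_height, overlap):
    """Return (top, bottom) pixel ranges that cover a tall page image.

    Each slice is at most screen_height tall and starts `overlap` pixels
    above the bottom of the previous one, so a line cut at the bottom of
    one slice is repeated whole near the top of the next.
    """
    if not 0 <= overlap < screen_height:
        raise ValueError("overlap must be >= 0 and smaller than screen_height")
    slices = []
    top = 0
    while True:
        bottom = min(top + screen_height, page_height)
        slices.append((top, bottom))
        if bottom >= page_height:
            break
        # Step back by `overlap` pixels before starting the next slice.
        top = bottom - overlap
    return slices


# Hypothetical numbers: a 1500-pixel-tall rendered page on a 600-pixel screen.
for top, bottom in slice_offsets(1500, 600, 45):
    print(top, bottom)
```

A smaller overlap gives less repeated text per screen, but a line taller than the overlap can still end up cut on both screens.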
Old 04-04-2007, 01:33 PM   #32
ashkulz
Quote:
Actually that was it, you have to reinstall Ebook Publisher AFTER you install pdfread.
Works great now!
Actually, I would guess that you have GEB Librarian installed and that you installed it AFTER eBook Publisher. GEB Librarian installs an older version of the SBPubX library, which is used for IMP creation. Naturally, when you reinstalled eBook Publisher, it registered the newer version, which made the problem go away :-)
Old 04-04-2007, 01:36 PM   #33
ashkulz
Addict
ashkulz will become famous soon enoughashkulz will become famous soon enoughashkulz will become famous soon enoughashkulz will become famous soon enoughashkulz will become famous soon enoughashkulz will become famous soon enoughashkulz will become famous soon enough
 
ashkulz's Avatar
 
Posts: 350
Karma: 705
Join Date: Dec 2006
Location: Mumbai, India
Device: Kindle 1/REB 1200
Quote:
Originally Posted by sammykrupa
Okay, I am messing up horribly now. I put the 'elementtree' folder in with the pdfread.py file, and that didn't make those "Element" warnings go away.

Also, there were no PNG files in the temp directory. And I couldn't figure out how to make that change in the code, I keep getting weird errors. It would probably be best for you to upload the changed code for me to try.

Sorry to bother you with this!
Sam Krupa
Okay, here you go. This goes a bit overboard and prints the result of each command, so you should get a lot of debugging output. I suspect that Ghostscript is giving you an error. Can you also try with the sample PDF I have posted?
Attached Files
File Type: txt pdfread_py.txt (16.1 KB, 795 views)
Old 04-04-2007, 02:11 PM   #34
sammykrupa
Reader of the Reader
Posts: 103
Karma: 107
Join Date: Apr 2006
Device: Sony Reader PRS-500
PDFRead Errors

Ashkulz!
I am reporting back with some juicy output from PDFRead!

The output is included in the attached text file.
Attached Files
File Type: txt output.txt (24.0 KB, 1122 views)
Old 04-04-2007, 02:31 PM   #35
sputnik
Enthusiast
Posts: 46
Karma: 133388
Join Date: Mar 2007
Location: London, Ontario
Device: EB 1150, iLiad
Quote:
Originally Posted by ashkulz
Hmm, can you run it manually and try specifying a lower hres? That is, run the command:
Code:
<install-dir>\pdfread-run.cmd -p eb1150 --hres 454 <pdf-file>
Can you please experiment and let me know the optimal resolution at which it doesn't get cut off? If you do, I will update the profile in the next release.

I ran this command in Windows: C:\PDFRead\pdfread-run.cmd -p eb1150 --hres 454 C:\Documents and Settings\Owner\Desktop\tempo\cheyne.pdf and then the attached DOS window appeared. As you can see, I do not know much about running a program manually. The hres specified in the command does not show up in the resulting DOS window. Could you please list the steps I have to follow so that I can run the program manually and experiment with different hres values? I tried changing the value for hres in pdfread.py, but it had no effect.
Attached Images
File Type: bmp screen.bmp (44.4 KB, 988 views)

Last edited by sputnik; 04-04-2007 at 03:02 PM.
Old 04-04-2007, 03:47 PM   #36
ashkulz
Addict
ashkulz will become famous soon enoughashkulz will become famous soon enoughashkulz will become famous soon enoughashkulz will become famous soon enoughashkulz will become famous soon enoughashkulz will become famous soon enoughashkulz will become famous soon enough
 
ashkulz's Avatar
 
Posts: 350
Karma: 705
Join Date: Dec 2006
Location: Mumbai, India
Device: Kindle 1/REB 1200
Quote:
Originally Posted by sputnik
I run in windows this command C:\PDFRead\pdfread-run.cmd -p eb1150 --hres 454 C:\Documents and Settings\Owner\Desktop\tempo\cheyne.pdf and then the attached dos window appears.
The problem seems to be that you haven't put the file name in quotes, so the spaces in the path split it into several arguments. Try putting the file somewhere in C:\ with no spaces in the path, or try running with quotes:
Code:
C:\PDFRead\pdfread-run.cmd -p eb1150 --hres 454 "C:\Documents and Settings\Owner\Desktop\tempo\cheyne.pdf"
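
To see why the quotes matter, the short Python sketch below splits both command lines into arguments. It uses the standard `shlex` module only as an approximation: `shlex` is not the Windows command interpreter, and the helper name `split_args` is made up for this example, but the effect of whitespace on an unquoted path is the same.

```python
import shlex

unquoted = r'pdfread-run.cmd -p eb1150 --hres 454 C:\Documents and Settings\Owner\Desktop\tempo\cheyne.pdf'
quoted = r'pdfread-run.cmd -p eb1150 --hres 454 "C:\Documents and Settings\Owner\Desktop\tempo\cheyne.pdf"'


def split_args(command_line):
    # posix=False keeps backslashes literal, as in Windows paths.
    # Quotes stay on the token in this mode, so strip them afterwards.
    return [tok.strip('"') for tok in shlex.split(command_line, posix=False)]


# Everything after "--hres 454" is what the program sees as file arguments.
print(split_args(unquoted)[5:])  # three separate pieces
print(split_args(quoted)[5:])    # one complete path
```

Without quotes the program receives "C:\Documents" as the file name, followed by two stray arguments.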
Old 04-04-2007, 03:59 PM   #37
ashkulz
Addict
ashkulz will become famous soon enoughashkulz will become famous soon enoughashkulz will become famous soon enoughashkulz will become famous soon enoughashkulz will become famous soon enoughashkulz will become famous soon enoughashkulz will become famous soon enough
 
ashkulz's Avatar
 
Posts: 350
Karma: 705
Join Date: Dec 2006
Location: Mumbai, India
Device: Kindle 1/REB 1200
Quote:
Originally Posted by sammykrupa
Ashkulz!
I am reporting back with some juicy output from PDFRead!

The output is included in the attached text file.
Okay, got the problem. It seems to be due to an older version of xpdf. I used 3.01, which has the -pagecrop option, but that option is not present in 3.00, which you have. Thus, the pdftops program printed a help message and did not create a PostScript file, and Ghostscript died with "Error: /undefinedfilename in (page.eps)". So you have two options:
  • Upgrade to 3.01, which will be a problem as I don't see a binary anywhere;
  • Use the file I have attached which does not use that option.

Note that I am not sure I will include this fix in the new release, as I do not know what the effect will be if I leave out this option (it treats the CropBox as the page size, removing unnecessary whitespace and/or typesetting). In general, I would recommend updating to a manually compiled 3.01.
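
One way a script could guard against this is to look for the option in the tool's own usage text before passing it. The following Python sketch is only an illustration and is not what the attached pdfread_py.txt does; the function names are made up, and it assumes that `pdftops -h` prints a usage message listing the supported options.

```python
import subprocess


def supports_pagecrop(help_text):
    """True if the usage text mentions the -pagecrop option."""
    return "-pagecrop" in help_text


def pdftops_help(executable="pdftops"):
    """Return the usage text of pdftops, or None if it cannot be run."""
    try:
        result = subprocess.run([executable, "-h"], capture_output=True,
                                text=True, errors="replace")
    except OSError:
        return None
    # Usage text may go to either stream, so look at both.
    return result.stdout + result.stderr


help_text = pdftops_help()
if help_text is None:
    print("pdftops could not be run; is xpdf installed and on the PATH?")
elif supports_pagecrop(help_text):
    print("pdftops accepts -pagecrop")
else:
    print("pdftops lacks -pagecrop; leave that option out")
```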

Please let me know if it works, and document the steps you took to achieve it so everyone else who uses OS X can also benefit from it :-)
Attached Files
File Type: txt pdfread_py.txt (16.1 KB, 787 views)
Old 04-04-2007, 04:36 PM   #38
sammykrupa
So close

Almost there!

There is now a PNG file in the temp directory, but I get this error:

Unable to determine total number of pages in PDF
Please enter total page count: 5

Temporary directory: /tmp/pdfread-6ktemb

Page 1/5: EXTRACT RASTERIZE CROP DILATE RESIZE
Please check that ImageMagick is installed.


I can run the 'convert' command on my system, so I do not know what is up.

Bummer. I hope helping me isn't too much of a problem.

Sam Krupa
Old 04-04-2007, 05:44 PM   #39
sputnik
Quote:
Originally Posted by ashkulz
The problem seems to be that you haven't put the file in quotes. So try putting the file somewhere in C:\ with no spaces, or try running with quotes
Code:
C:\PDFRead\pdfread-run.cmd -p eb1150 --hres 454 "C:\Documents and Settings\Owner\Desktop\tempo\cheyne.pdf"
hres 445 solved the problem and it also looks good (hres 448 also solves the problem, but only barely, so I prefer 445). Also, --overlap 10 works better than --overlap 45 (at least for smaller fonts), as there is almost no redundancy.

Last edited by sputnik; 04-04-2007 at 05:52 PM.
Old 04-04-2007, 07:12 PM   #40
kgian
Enthusiast
Posts: 31
Karma: 10
Join Date: Apr 2007
Device: EBW-1150
I agree with sputnik for hres 445 and overlap 10. With these options everything looks fine!

So, for example, the right command from the command prompt would be:

pdfread-run.cmd -p eb1150 -o c:\02\kos.imp --hres 445 --overlap=10 c:\02\kos.pdf


for a file named kos.pdf in the c:\02 directory.
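
For anyone who wants to compare several hres values in one go, here is a small Python sketch that only builds and prints the command lines. The options are the ones used in this thread; the helper name `pdfread_command` and the output file naming are assumptions for the example.

```python
import subprocess


def pdfread_command(pdf_path, out_path, hres, overlap, profile="eb1150"):
    """Build the argument list for one pdfread-run.cmd invocation."""
    return ["pdfread-run.cmd", "-p", profile, "-o", out_path,
            "--hres", str(hres), "--overlap=%d" % overlap, pdf_path]


for hres in (454, 448, 445):
    out_path = r"c:\02\kos-%d.imp" % hres  # hypothetical output naming
    args = pdfread_command(r"c:\02\kos.pdf", out_path, hres, overlap=10)
    # list2cmdline adds quotes where needed, e.g. for paths with spaces.
    print(subprocess.list2cmdline(args))
```

Each printed line can be pasted into the command prompt, and the resulting IMP files compared on the device.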
Old 04-04-2007, 10:11 PM   #41
alex_d
Addict
Posts: 303
Karma: 187
Join Date: Dec 2006
Device: Sony Reader
ashkulz, why don't you do what I do and bundle all supporting programs with the script? Maybe Python can't be bundled, but at least bundle xpdf so that "no, don't use 3.00, use manually compiled 3.01" could be avoided.

The stuff that you did with auto-adjusting autocropping sounds cool. I haven't taken apart your stuff yet. How do you do it? Do you do the cropping directly with xpdf, or do you rasterize and then use image tools? How do you make measurements/calculations? Also, I'm confused... do you use Ghostscript or xpdf? Could you maybe post a quick summary of your toolchain either here or on the "pythonized pdfrasterfarian" thread?
Old 04-04-2007, 11:40 PM   #42
ashkulz
Quote:
Originally Posted by sammykrupa
Almost there!

There is now a PNG file in the temp directory, but I get this error:

Unable to determine total number of pages in PDF
Please enter total page count: 5

Temporary directory: /tmp/pdfread-6ktemb

Page 1/5: EXTRACT RASTERIZE CROP DILATE RESIZE
Please check that ImageMagick is installed.


I can run the 'convert' command on my system, so I do not know what is up.

Bummer. I hope helping me isn't too much of a problem.

Sam Krupa
Hey, no problem -- you're the only "user" who is on OS X and interested enough to try it on that platform, so I have to keep you happy.

Okay, I discovered that there are a few bugs on Linux caused by a workaround I made for Windows. I will have to see how to fix them, but in the meantime can you try the attached version?
Attached Files
File Type: txt pdfread_py.txt (16.1 KB, 791 views)
Old 04-04-2007, 11:44 PM   #43
ashkulz
Quote:
Originally Posted by kgian
I agree with sputnik for hres 445 and overlap 10. With these options everything looks fine!
Okay, I will change the hres to 445 in the new version I will be releasing by Saturday. I don't want to reduce the overlap to 10, as it's too small if the font is larger. I'll reduce it to 25, and will provide an option in the GUI where you can reduce it to whatever you prefer.
Old 04-04-2007, 11:54 PM   #44
ashkulz
Quote:
ashkulz, why don't you do what I do and bundle all supporting programs with the script? Maybe Python can't be bundled, but at least bundle xpdf so that "no, don't use 3.00, use manually compiled 3.01" could be avoided.
Well, I *do* bundle it for Windows (see the installer). sammykrupa is on OS X, which is why all this is required -- I don't have access to OS X, and installing private versions of tools is very hard to implement on non-Windows systems, and not recommended at all. It's much better to let the native package management system handle the installation and upgrade process for the individual tools.

Quote:
The stuff that you did with auto-adjusting autocropping sounds cool. I haven't taken apart your stuff yet. How do you do it? Do you do the cropping directly with xpdf, or do you rasterize and then use image tools? How do you make measurements/calculations? Also, I'm confused... do you use Ghostscript or xpdf? Could you maybe post a quick summary of your toolchain either here or on the "pythonized pdfrasterfarian" thread?
Okay, will answer one by one:
  • I use xpdf to convert from PDF to PostScript, and then rasterize that with Ghostscript.
  • I don't use the Ghostscript cropbox detection at all, though you can enable it with --gscrop. I had problems with it when the PDF already had a CropBox which covered prepress marks.
  • I use the PIL to detect all the "white" space surrounding an image, and directly crop that. This is very fast and accurate: unless you've got a scan (where there may be some noise), it will remove all of the whitespace (even more than what is detected by Ghostscript). I plan to add noise elimination and more aggressive cropping (similar to what curiouser did) soon.
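
The whitespace-cropping idea in the last point can be sketched in a few lines. The example below is a pure-Python stand-in that works on a list of grayscale rows, so it runs without PIL installed; it is not pdfread's code, and the names `content_bbox`, `crop` and `white_threshold` are assumptions. With PIL itself, one common approach is to invert the image and call `getbbox()` on the result.

```python
def content_bbox(pixels, white_threshold=250):
    """Bounding box (left, top, right, bottom) of non-white pixels.

    `pixels` is a list of rows of grayscale values (0 = black, 255 = white).
    Right and bottom are exclusive. Returns None for an all-white page.
    """
    rows = [y for y, row in enumerate(pixels)
            if any(v < white_threshold for v in row)]
    if not rows:
        return None
    cols = [x for x in range(len(pixels[0]))
            if any(row[x] < white_threshold for row in pixels)]
    return (cols[0], rows[0], cols[-1] + 1, rows[-1] + 1)


def crop(pixels, bbox):
    """Keep only the pixels inside the bounding box."""
    left, top, right, bottom = bbox
    return [row[left:right] for row in pixels[top:bottom]]


W = 255
page = [
    [W, W, W, W, W],
    [W, W, 0, W, W],
    [W, 0, 0, 0, W],
    [W, W, W, W, W],
]
bbox = content_bbox(page)
print(bbox)              # (1, 1, 4, 3)
print(crop(page, bbox))  # [[255, 0, 255], [0, 0, 0]]
```

This also shows why scans are harder: a single dark speck in the margin counts as content and pushes the bounding box out to the speck.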

I'll mention the technical details in the other thread.
Old 04-05-2007, 06:22 AM   #45
sammykrupa
Ashkulz,
Your magic seems to have made the ImageMagick error disappear, but that original pesky error we were getting before still exists:

p$ python pdfread.py -p prs500 sample.pdf
Unable to determine total number of pages in PDF
Please enter total page count: 5

Temporary directory: /tmp/pdfread-tCVV7p

Page 1/5: EXTRACT RASTERIZE CROP DILATE RESIZE DONE
Page 2/5: EXTRACT RASTERIZE CROP DILATE RESIZE DONE
Page 3/5: EXTRACT RASTERIZE CROP DILATE RESIZE DONE
Page 4/5: EXTRACT RASTERIZE CROP DILATE RESIZE DONE
Page 5/5: EXTRACT RASTERIZE CROP DILATE RESIZE DONE
Traceback (most recent call last):
File "/Users/mikekrup/Desktop/pdfread-v4-src/src/pdfread.py", line 496, in ?
PdfConverter().main()
File "/Users/mikekrup/Desktop/pdfread-v4-src/src/pdfread.py", line 319, in main
delete = self.FORMATS[self.options.format](self)
File "/Users/mikekrup/Desktop/pdfread-v4-src/src/pdfread.py", line 260, in generate_lrf
from pylrs.pylrs import Book, PageStyle, BlockStyle, ImageStream, BlockSpace, ImageBlock
File "/Users/mikekrup/Desktop/pdfread-v4-src/src/pylrs/pylrs.py", line 11
from elementtree.ElementTree import (Element, SubElement)
^
SyntaxError: invalid syntax


I have the folder called "elmenttree" in with the pdfread.py file.

Sam Krupa