Old 04-27-2007, 11:16 AM   #16
ashkulz
Addict
Posts: 350
Karma: 705
Join Date: Dec 2006
Location: Mumbai, India
Device: Kindle 1/REB 1200
Quote:
Originally Posted by gdxf
I followed the batch mode instructions to run batch conversion in windows, but had encountered this notice in the command line:

"Unable to determine total number of pages in document
Please enter number of pages: "

When I put in a page number, it results in a blank lrf file.

Here is what the screen says:

"Unable to determine total number of pages in document
Please enter number of pages: 1

Temporary directory: c:\docume~1........

Page 1/1: EXTRACT RASTERIZE BLANK

Creating BBeB file ... done."
That's a very weird error; it usually occurs when your installation has not been set up correctly. Can you check the following:
  • Check whether you can convert PDF files normally via the GUI
  • Try the attached script with same instructions
  • Check that the PDFRead location is set correctly (set LOC=)
  • Uncomment the commented call in the file, try it again, and send me the output.
  • Zip up the directory and attach it here or send it to me
Attached Files
File Type: txt pdfread-batch_bat.txt (610 Bytes, 1151 views)
Old 04-27-2007, 11:25 AM   #17
ashkulz
Addict
Posts: 350
Karma: 705
Join Date: Dec 2006
Location: Mumbai, India
Device: Kindle 1/REB 1200
Quote:
Originally Posted by kovidgoyal
Also, this is my first time rasterizing a PDF (I usually have access to the LaTeX sources). Is the font rasterization always so bad? I've attached samples to show you what I mean.
I don't have a Sony Reader, so I can't really see how the generated LRF looks. On the other hand, the converted PDF did look decent when I looked at the PNG. Are there any particular points that felt really bad? I'm always interested in knowing where I can improve things...
Old 04-27-2007, 12:32 PM   #18
kovidgoyal
creator of calibre
Posts: 43,858
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
You can install the Connect Reader software and use that to see how the files look. Basically the fonts look like they've been rasterized without any antialiasing.
Old 04-27-2007, 12:42 PM   #19
ashkulz
Addict
Posts: 350
Karma: 705
Join Date: Dec 2006
Location: Mumbai, India
Device: Kindle 1/REB 1200
Uhm, I don't have access to a Windows PC at home ... so if you could post some screenshots I'd be grateful. But yes, the fonts do look a bit ragged ... what happens is that I render at 300dpi (anti-aliased), perform dilation at that resolution and then reduce the size. As a result, the anti-aliasing ends up in the reduced image, which is bad because when you downsample it to 4 colors you can get "gaps" where the gray-level information is lost due to the 2-bit grayscale limitation. As far as I know, even RasterFarian has pretty much the same output. Can you try with that and see how good the result is?

BTW, can you try again with 1.7? I replaced ImageMagick with pngnq, which may give better output...
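
To make the "gaps" effect concrete, here is a minimal sketch of the render, dilate, reduce, and quantize steps described above. It is not PDFRead's actual code: plain nested lists stand in for images, and the 3x3 minimum filter, the box-average reduction, and the four gray levels (0, 85, 170, 255) are all assumptions chosen for illustration.

```python
# Illustrative only: nested lists of 0..255 gray values stand in for images.
LEVELS = (0, 85, 170, 255)  # assumed 2-bit grayscale palette


def dilate(img):
    """Thicken dark strokes: each pixel takes the darkest value in its 3x3 neighbourhood."""
    h, w = len(img), len(img[0])
    return [
        [
            min(
                img[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
            )
            for x in range(w)
        ]
        for y in range(h)
    ]


def shrink(img, factor):
    """Reduce the size by averaging factor x factor blocks (this produces the gray edge values)."""
    h, w = len(img) // factor, len(img[0]) // factor
    area = factor * factor
    return [
        [
            sum(
                img[y * factor + dy][x * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ) // area
            for x in range(w)
        ]
        for y in range(h)
    ]


def quantize_2bit(img):
    """Snap every pixel to the nearest of the four gray levels."""
    return [[min(LEVELS, key=lambda level: abs(level - p)) for p in row] for row in img]


# An 8x8 white "page" with a one-pixel-wide black vertical stroke in column 3.
page = [[0 if x == 3 else 255 for x in range(8)] for _ in range(8)]

print(quantize_2bit(shrink(page, 8)))          # [[255]]: the thin stroke vanishes
print(quantize_2bit(shrink(dilate(page), 8)))  # [[170]]: the dilated stroke survives
```

The thin stroke averages out to a light gray that rounds to white, which is the kind of gap mentioned above; thickening it first keeps enough darkness to land on a visible gray level.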

Last edited by ashkulz; 04-27-2007 at 12:49 PM.
Old 04-28-2007, 10:58 AM   #20
Gravitas
Muppet
Posts: 123
Karma: 107
Join Date: Apr 2007
Location: Nottingham, England, UK
Device: Zen Vision :M / Nokia 5800 musicXpress / Sony PRS500
1.7 fixed the problems I was having, now works like a dream. Thanks
Old 04-28-2007, 11:13 AM   #21
ashkulz
Addict
Posts: 350
Karma: 705
Join Date: Dec 2006
Location: Mumbai, India
Device: Kindle 1/REB 1200
Quote:
Originally Posted by Gravitas
The text is not as clear as non-pdf converted documents, but is perfectly readable so long as I up the font size to medium. I may try the same document again with the pngs optimized to see if that improves the text any, but I'm happy with how it is at the moment.
Well, that's a side effect of being used to native font rendering and then putting up with something that is rasterized from PDFs, which target a much higher DPI. Also, PNG optimization only tries to reduce the file size; it doesn't change any of the display parameters! You may want to experiment with the DPI and/or edge enhancement level to find what looks best. I don't have a Reader, so I don't know whether the default settings I've chosen are equally good on it.
Old 04-28-2007, 11:21 AM   #22
ashkulz
Addict
Posts: 350
Karma: 705
Join Date: Dec 2006
Location: Mumbai, India
Device: Kindle 1/REB 1200
Okay, I'm planning to release 1.8 in a day or two. The major planned feature is an all-color pipeline (with an option to downsample to grayscale, of course). This won't be of much use to anyone except people who own the REB 1200 (i.e. me) and those who get those newfangled color e-ink readers.

Some previews of how things look in color: raw page, dilated page, and after color reduction. Regular text pages also work as they used to: raw text page and the dilated text.

Do any of you have any feature requests for 1.8? I don't feel comfortable with such short releases where only a few new things are added ...
Old 04-28-2007, 06:41 PM   #23
gdxf
Enthusiast
Posts: 48
Karma: 27
Join Date: Oct 2006
Device: Sony Reader PRS-500
I used your batch file and changed the conversion directory from "My Desktop" to another drive on my computer. It works! I guess a user access restriction might be involved, but I am not sure about that.

Some files convert with no problem; others still hit this annoying "unable to determine total number of pages" problem. I later found that the files that cannot be converted include: 1. PDF files with OCR text underneath the image, 2. PDF files with non-alphabetic file names. I hope this can be dealt with in later releases.
Old 04-28-2007, 08:26 PM   #24
kovidgoyal
creator of calibre
Posts: 43,858
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Quote:
Originally Posted by ashkulz
Uhm, I don't have access to a Windows PC at home ... so if you could post some screenshots I'd be grateful. But yes, the fonts do look a bit ragged ... what happens is that I render at 300dpi (anti-aliased), perform dilation at that resolution and then reduce the size. As a result, the anti-aliasing ends up in the reduced image, which is bad because when you downsample it to 4 colors you can get "gaps" where the gray-level information is lost due to the 2-bit grayscale limitation. As far as I know, even RasterFarian has pretty much the same output. Can you try with that and see how good the result is?

BTW, can you try again with 1.7? I replaced ImageMagick with pngnq, which may give better output...
I'm travelling but I'll do some experimentation when I return. I highly recommend VMware and an old Windows installation disk.
Old 04-29-2007, 12:25 AM   #25
ashkulz
Addict
Posts: 350
Karma: 705
Join Date: Dec 2006
Location: Mumbai, India
Device: Kindle 1/REB 1200
Quote:
I used your batch file and changed the conversion directory from "My Desktop" to another drive on my computer. It works! I guess a user access restriction might be involved, but I am not sure about that.
Did you use the new batch file and if so, did you run from both Desktop and some other place? There's no logical reason I can think of why it shouldn't run from Desktop -- did you get the same error as before or something else when you ran from there?

Quote:
Some files convert with no problem; others still hit this annoying "unable to determine total number of pages" problem. I later found that the files that cannot be converted include: 1. PDF files with OCR text underneath the image, 2. PDF files with non-alphabetic file names. I hope this can be dealt with in later releases.
That happens when pdftk cannot report how many pages there are in a document. You'll have to manually open each such document and find out how many pages there are and enter it. Can you link/post a sample file? I'll have to see how to detect the page count for those files -- they look like their information dictionary is corrupt or something.
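
For readers who want to see the shape of that lookup, here is a rough sketch of asking pdftk for the page count and falling back to the manual prompt. It relies on the `NumberOfPages:` line that `pdftk <file> dump_data` prints; the function names are hypothetical, and this is not necessarily how PDFRead itself does it.

```python
import re
import subprocess


def parse_page_count(dump_data_text):
    """Return the page count found in `pdftk dump_data` output, or None if absent."""
    match = re.search(r"^NumberOfPages:\s*(\d+)\s*$", dump_data_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def count_pages(pdf_path, ask=input):
    """Ask pdftk for the page count; prompt the user when pdftk cannot report it."""
    try:
        result = subprocess.run(
            ["pdftk", pdf_path, "dump_data"],
            capture_output=True,
            text=True,
            errors="replace",  # metadata may contain bytes that are not valid text
            check=False,
        )
        pages = parse_page_count(result.stdout)
    except OSError:  # pdftk is missing or cannot be executed
        pages = None
    if pages is None:
        print("Unable to determine total number of pages in document")
        pages = int(ask("Please enter number of pages: "))
    return pages
```

The `ask` parameter exists only so the prompt can be replaced in a script or a test; by default it is the built-in `input`.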
Old 04-29-2007, 04:23 AM   #26
gdxf
Enthusiast
Posts: 48
Karma: 27
Join Date: Oct 2006
Device: Sony Reader PRS-500
Quote:
Originally Posted by ashkulz
Did you use the new batch file and if so, did you run from both Desktop and some other place? There's no logical reason I can think of why it shouldn't run from Desktop -- did you get the same error as before or something else when you ran from there?

That happens when pdftk cannot report how many pages there are in a document. You'll have to manually open each such document and find out how many pages there are and enter it. Can you link/post a sample file? I'll have to see how to detect the page count for those files -- they look like their information dictionary is corrupt or something.
Yes, I did use the new batch file. It worked well everywhere except in desktop directories. But that doesn't matter very much to me; the point is that it at least worked elsewhere.

I manually put in the page number and it encountered the decoding error. I've posted the command line error info below and also attached the zipped directory and problematic file. I think it is because the filename is non-unicode...

---------------------------------------------

Unable to determine total number of pages in document
Please enter number of pages: 2

Page 1/2: EXTRACT RASTERIZE CROP DILATE SPLIT SAVE DONE
Page 2/2: EXTRACT RASTERIZE CROP DILATE SPLIT SAVE DONE
Creating BBeB file ... Traceback (most recent call last):
File "pdfread.py", line 201, in <module>
File "pdfread.py", line 86, in main
File "output.pyo", line 212, in generate
File "pylrs\pylrs.pyo", line 472, in renderLrf
File "pylrs\pylrs.pyo", line 250, in toLrf
File "pylrs\pylrs.pyo", line 246, in toLrfDelegates
File "pylrs\pylrs.pyo", line 250, in toLrf
File "pylrs\pylrs.pyo", line 246, in toLrfDelegates
File "pylrs\pylrs.pyo", line 561, in toLrf
File "pylrs\elements.pyo", line 68, in toString
File "pylrs\elements.pyo", line 76, in write
File "pylrs\elements.pyo", line 51, in _write
File "pylrs\elements.pyo", line 51, in _write
File "pylrs\elements.pyo", line 42, in _write
File "pylrs\elements.pyo", line 25, in _writeAttribute
File "pylrs\elements.pyo", line 13, in _encodeCdata
File "encodings\utf_8.pyo", line 16, in decode
UnicodeDecodeError: 'utf8' codec can't decode byte 0xb1 in position 0: unexpected code byte
Press any key to continue . . .

Last edited by gdxf; 04-29-2007 at 06:04 PM.
Old 04-29-2007, 11:32 AM   #27
ashkulz
Addict
Posts: 350
Karma: 705
Join Date: Dec 2006
Location: Mumbai, India
Device: Kindle 1/REB 1200
Quote:
Originally Posted by gdxf
I manually put in the page number and it encountered the decoding error. I've posted the command line error info below and also attached the zipped directory and problematic file. I think it is because the filename is non-unicode...
Yes, you're right -- it did fail because of the non-unicode filename (I think you have some kind of Chinese/Japanese encoding). That's a limitation of pylrs: you have to use the utf8 encoding (this can be overridden, but it is way too much trouble to implement and get right).

If you ensure that fonts are embedded in the PDF and the filename doesn't have special characters, it should convert properly.
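
As an illustration of the failure and the workaround, the sketch below shows why a legacy-encoded name cannot be decoded as UTF-8 and how it could be re-encoded first. The GBK code page is purely an assumption (the thread only says "some kind of Chinese/Japanese encoding"), the sample bytes are made up to start with 0xb1 like the byte in the traceback, and the helper names are hypothetical.

```python
def is_valid_utf8(raw):
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_utf8(raw, legacy_encoding):
    """Re-encode bytes from a legacy code page as UTF-8; valid UTF-8 passes through unchanged."""
    if is_valid_utf8(raw):
        return raw
    return raw.decode(legacy_encoding).encode("utf-8")


# Made-up file name bytes; 0xb1 cannot start a UTF-8 sequence, hence the error.
legacy_name = b"\xb1\xbe.pdf"

print(is_valid_utf8(legacy_name))                  # False
print(is_valid_utf8(to_utf8(legacy_name, "gbk")))  # True
```

Renaming the files to plain ASCII, as suggested above, avoids the question of which legacy code page the names were written in.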
Old 04-29-2007, 05:55 PM   #28
Jary
Member
Posts: 11
Karma: 10
Join Date: Apr 2007
Device: PRS-500
Hi people.

ashkulz, you did a great job! I've been using 1.6 and I quite like it.

The install is just perfect.
The prs-500 mode is good, and prs500-l is very nice too. Maybe the GUI isn't totally clear the first time, and it leaves the .lrf extension off my files, but otherwise it rocks.

One thing: why add a "title" field when there is an "output" field? When you fill in the output name, shouldn't it be copied into the title automatically?

Good job!

Please keep it up.

Last edited by Jary; 04-29-2007 at 05:59 PM.
Old 04-29-2007, 06:09 PM   #29
gdxf
Enthusiast
Posts: 48
Karma: 27
Join Date: Oct 2006
Device: Sony Reader PRS-500
Thanks ashkulz! I'll convert the filenames to fit utf8 encoding. I've batch converted a dozen files overnight and it turned out quite well.
Old 04-30-2007, 01:06 AM   #30
ashkulz
Addict
Posts: 350
Karma: 705
Join Date: Dec 2006
Location: Mumbai, India
Device: Kindle 1/REB 1200
Quote:
Originally Posted by Jary
The prs-500 mode is good, and prs500-l is very nice too. Maybe the GUI isn't totally clear the first time, and it leaves the .lrf extension off my files, but otherwise it rocks.

One thing: why add a "title" field when there is an "output" field? When you fill in the output name, shouldn't it be copied into the title automatically?
You might want to upgrade to 1.7; the extension is now automatically added after processing. The output field is for the output filename which can be anything -- I might want to store books with filename "Author - Title" or any other scheme. That's why I have a separate title field. But yes, you can copy the basic filename as the title (which I do in the batch conversion script) but it's currently not very easy to implement in the GUI (which is actually based on the NSIS installer).
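
The two behaviours mentioned here (adding the extension automatically, and using the base filename as the title) could look roughly like the sketch below. It is only an illustration in Python, not the NSIS GUI or the batch script, and both function names are hypothetical.

```python
from pathlib import PureWindowsPath


def ensure_lrf_extension(output_name):
    """Append .lrf unless the output name already ends with it (case-insensitive)."""
    path = PureWindowsPath(output_name)
    if path.suffix.lower() == ".lrf":
        return str(path)
    return str(path.with_name(path.name + ".lrf"))


def default_title(output_name):
    """Use the base filename, minus a trailing .lrf, as the default title."""
    path = PureWindowsPath(output_name)
    return path.stem if path.suffix.lower() == ".lrf" else path.name


print(ensure_lrf_extension("Author - Title"))  # Author - Title.lrf
print(default_title("Author - Title.lrf"))     # Author - Title
```

Keeping the title as a separate value still allows a filename scheme such as "Author - Title" while the book's own title stays shorter.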