Old 03-29-2007, 06:48 PM   #1
curiouser
Junior Member
curiouser began at the beginning.
 
Posts: 9
Karma: 10
Join Date: Jul 2006
pythonized PDFrasterFarian

In the interest of advancing cross-platform development of alex_d's work, I present a reasonable translation to Python. It is not feature-complete relative to the current PDFrasterFarian but, as is, it should give you nice full-page PDF conversions. Plus, I've tossed in some new features. I might have waited a little longer to release this, but the output from Google Books PDFs is a big improvement, which I thought would interest some.

Also, the Windows-specific dependencies are pretty much gone. It should not be too hard to go from this point to Linux and Mac versions, as well as automate its use as CGI, etc.

Improvements:
- properly centers content
- does nice trimming of PDFs produced from scans (like ones from Google Books - try it out!)
- monochrome output option for smaller (though a bit less legible) PDFs
- eliminates use of AutoImager (using PIL - no speed penalty)
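
The trimming and centering steps listed above can be sketched roughly as follows. This is not the code from pyprf.py: it works on a plain list of rows of grayscale values (0 = black, 255 = white) so that it runs without PIL, and the function names and the `threshold` parameter are assumptions made for illustration. With PIL itself, the bounding-box step would more likely rely on Image.getbbox(), which reports the box around non-zero pixels.

```python
def content_bbox(pixels, threshold=128):
    """Return (left, top, right, bottom) around pixels darker than threshold.

    right and bottom are exclusive, like a Python slice. Returns None when
    the page holds no dark pixels at all.
    """
    rows = [y for y, row in enumerate(pixels) if any(v < threshold for v in row)]
    if not rows:
        return None
    cols = [x for x in range(len(pixels[0]))
            if any(row[x] < threshold for row in pixels)]
    return (cols[0], rows[0], cols[-1] + 1, rows[-1] + 1)


def trim(pixels, threshold=128):
    """Crop away the light margins that surround the content."""
    bbox = content_bbox(pixels, threshold)
    if bbox is None:
        return [row[:] for row in pixels]  # blank page: nothing to trim
    left, top, right, bottom = bbox
    return [row[left:right] for row in pixels[top:bottom]]


def center(pixels, width, height, background=255):
    """Paste pixels into the middle of a width x height canvas.

    Assumes the content is no larger than the canvas.
    """
    src_w = len(pixels[0])
    off_x = (width - src_w) // 2
    off_y = (height - len(pixels)) // 2
    canvas = [[background] * width for _ in range(height)]
    for y, row in enumerate(pixels):
        canvas[off_y + y][off_x:off_x + src_w] = row
    return canvas
```

Scanned pages usually carry speckle noise in the margins, so a real implementation would probably need a more forgiving test than "any pixel below the threshold".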

Missing:
- various orientation settings (splitting into 1/2 and 1/4 pages)
- not using temp dir for scratch work
- changing title & author fields

To Do:
- add proper hookup of command line args
- eliminate dependency on pdftk - should be able to digest the PDF in python
- add a little more control for cropping

Installation:
1) install PDFrasterFarian
2) install python
3) install PIL (http://www.pythonware.com/products/pil/)
4) unzip the attached pyprf.zip into the same directory

Usage:
python pyprf.py foo.pdf
(sorry, but if you want to tweak processing options, you'll have to tweak the python code)

Enjoy!

P.S. I just noticed the title and author are hardcoded - sorry!
Attached Files
File Type: zip pyprf.zip (4.7 KB, 438 views)

Last edited by curiouser; 03-29-2007 at 06:55 PM.
Old 03-30-2007, 12:06 AM   #2
ashkulz
Addict
ashkulz will become famous soon enough

Posts: 350
Karma: 705
Join Date: Dec 2006
Location: Mumbai, India
Device: Kindle 1/REB 1200
I have also developed a Python-based version using PIL, along with a Windows installer. You can see it at

http://www.mobileread.com/forums/showthread.php?p=63175

Maybe we can merge our efforts so we get a cross-platform, cross-ebook version?
Old 03-30-2007, 07:14 PM   #3
Shake
Member
Shake began at the beginning.
 
Posts: 20
Karma: 10
Join Date: Dec 2006
Do I understand correctly that, with your work, PDFrasterFarian should now run under Linux? That would really be great news.
Old 03-30-2007, 07:58 PM   #4
curiouser
Junior Member
curiouser began at the beginning.
 
Posts: 9
Karma: 10
Join Date: Jul 2006
The python script no longer needs anything from Windows. However, to get it working under Linux or OS X, a little bit of work needs to be put in. First of all, your Linux system must have the following installed:

- pdftk
- ImageMagick
- ghostscript
- pdftops (which is part of xpdf)

Luckily, all of the above are either part of a standard Linux install, or there should be packages available to install them.

Then, the Python script needs to be changed to fix the relevant paths. You're looking at stripping out the paths from the following lines (since the executables should be on your PATH):

gs_exec = "software\\gs\\gs8.54\\bin\\gswin32c.exe"
im = "software\\ImageMagick-6.3.1-Q8\\convert.exe"
(pin, pout) = os.popen2('software\\pdftk.exe "%s" dump_data' % (fil))
os.popen2('software\\pdftops.exe -f %d -l %d -eps -pagecrop "%s" prv_1.eps' % (pageNum, pageNum, fil))
(pin, pout) = os.popen2("software\\pdftk.exe %s dump_data" % (fil))

Plus, you'll have to make some change to the following line (I suspect you can set it equal to "", since ghostscript will be fully installed, and the items will be found):

gs_includes = '-I.\\software\\gs\\gs8.54\\lib -I.\\software\\gs\\gs8.54\\Resource -I.\\software\\gs\\fonts'
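
One way to avoid hand-editing those paths is to pick the executable per platform. The sketch below is untested against pyprf.py itself; `WINDOWS_PATHS`, `tool`, `pdftk_dump_command` and `page_count` are hypothetical names, and the non-Windows branch simply assumes the tools are on the PATH under their usual names. Note also that os.popen2 no longer exists in Python 3, where the subprocess module takes its place.

```python
import sys

# Bundled locations copied from the lines quoted above (Windows only).
WINDOWS_PATHS = {
    "gs": "software\\gs\\gs8.54\\bin\\gswin32c.exe",
    "convert": "software\\ImageMagick-6.3.1-Q8\\convert.exe",
    "pdftk": "software\\pdftk.exe",
    "pdftops": "software\\pdftops.exe",
}


def tool(name, platform=sys.platform):
    """Return the command for a helper program on the given platform."""
    if platform.startswith("win"):
        return WINDOWS_PATHS[name]
    # Assumption: on Linux and OS X the executable is found via the PATH.
    return name


def pdftk_dump_command(pdf_path, platform=sys.platform):
    """Build the pdftk call as an argument list, so no shell quoting is needed."""
    return [tool("pdftk", platform), pdf_path, "dump_data"]


def page_count(dump_text):
    """Pull the page count out of the text printed by `pdftk ... dump_data`."""
    for line in dump_text.splitlines():
        if line.startswith("NumberOfPages:"):
            return int(line.split(":", 1)[1])
    return None
```

The argument list can then be handed to subprocess, for example subprocess.run(pdftk_dump_command("foo.pdf"), capture_output=True, text=True), and the captured stdout passed to page_count().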

The last big thing you'll have to deal with is replacing the following:

os.popen2('software\\lrs2lrf\\lrs2lrf.exe "%s" "%s"' % (fil[:-4] + ".lrs", fil[:-4] + ".lrf"))

I'm not sure what the equivalent is on Linux, but I assume something is out there. Otherwise, you're probably looking at a conversion over to using the pylrs stuff put out by Falstaff (I've started playing with this, but there is a lot about LRS/LRF I don't understand yet).

Finally, you'll need a copy of modules/book_thumb.gif, which you can get from PDFrasterFarian.

I don't have a Linux system going right now, so I can't do this myself. Hopefully this provides enough of a roadmap.
Old 03-30-2007, 11:49 PM   #5
ashkulz
Addict
ashkulz will become famous soon enough

Posts: 350
Karma: 705
Join Date: Dec 2006
Location: Mumbai, India
Device: Kindle 1/REB 1200
Quote:
Originally Posted by Shake
Do I understand correctly that, with your work, PDFrasterFarian should now run under Linux? That would really be great news.
The script I mentioned above (PDFRead) does work on Linux. I developed it primarily on Ubuntu and did the Windows port later on.
Old 03-31-2007, 12:24 PM   #6
Shake
Member
Shake began at the beginning.
 
Posts: 20
Karma: 10
Join Date: Dec 2006
I have tried the thing ashkulz has built.

I strongly suggest that you merge your work.
Old 04-01-2007, 03:42 AM   #7
alex_d
Addict
alex_d doesn't litter
 
Posts: 303
Karma: 187
Join Date: Dec 2006
Device: Sony Reader
yeah, i'd love it if we could all collaborate.


anyone reading this also good at GUIs? maybe .NET or Java that'll work cross-platform (e.g. using Mono)? Besides getting the nuts and bolts working smoothly across platforms, what i'd really like to see is a good UI that lets you collate various sources (e.g. folders of images, DjVu, etc.) and then manually crop the pages properly. That would really make for a great, all-in-one tool.
Old 04-01-2007, 03:54 AM   #8
ashkulz
Addict
ashkulz will become famous soon enough

Posts: 350
Karma: 705
Join Date: Dec 2006
Location: Mumbai, India
Device: Kindle 1/REB 1200
Well, I was talking to kovidgoyal (author of libprs500) and he was thinking of calling PDFRead in its GUI so that there'd be an end-to-end solution for the Sony Reader. I was also planning to contribute to make a REB/EBW backend, but I don't know how feasible that would be -- the device capabilities are completely different.

Maybe we should all get together on chat and have a brainstorming session?
Old 04-01-2007, 11:44 AM   #9
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.

Posts: 26,450
Karma: 5383257
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
I vote for a PyQt GUI. I'm willing to do the work of making the GUI. I can write it so that it can be run both standalone and from within libprs500. We can then use py2exe and py2app to make standalone executables for the Windows and OS X users who don't want to download the full libprs500.
Old 04-02-2007, 03:10 AM   #10
alex_d
Addict
alex_d doesn't litter
 
Posts: 303
Karma: 187
Join Date: Dec 2006
Device: Sony Reader
PyQt will work on Windows, and without [significant] dependencies?

Anyway, what's certain is that we need to split the project up into a backend and a frontend. The backend could then be used by other people for their own frontends. It should also be modular, so that, e.g., people could pass it documents like PDFs, or they could pass raw images and have them post-processed and collated. Or they might pass a PDF and the output resolution, and then receive a folder of images they could collate themselves. Thus, the backend could be extended for use with unforeseen input and output formats.

Thus the backend is in three parts: a rasterizing stage (this stage is capable of autocropping, it can also produce output directly for the frontend for showing previews and for setting manual cropping), a processing stage (which takes lists of images, cropping parameters, processing parameters, etc.), and a collating stage.
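
As a rough illustration of that three-stage split (a sketch only, not an agreed design), the backend could be a small object holding one callable per stage. `Backend` and its method names are hypothetical; the three entry points correspond to the three ways of using the backend described above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Backend:
    """Three pluggable stages; each one is just a callable."""
    rasterize: Callable  # document -> list of page images
    process: Callable    # one page image -> one processed page image
    collate: Callable    # list of page images -> finished output

    def convert_document(self, document):
        """Full pipeline: document in, finished output out."""
        return self.convert_images(self.rasterize(document))

    def convert_images(self, images):
        """Skip rasterizing: the caller already has raw images."""
        return self.collate([self.process(image) for image in images])

    def images_only(self, document):
        """Stop before collating, so the caller can collate the pages."""
        return [self.process(image) for image in self.rasterize(document)]
```

A frontend that wants previews for manual cropping could call the rasterize stage on its own and feed the chosen crop parameters into whatever callable it passes as `process`.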


One fun question is how can the processing stage be made better. Right now it's a dilate and a sharpening. The dilate is straightforward but the sharpening has many parameters. I used defaults for PDFR 2.1 but tweaked them for 2.2 for more effect. I don't think anyone actually got to see the 2.2 changes, but Ashkulz currently feels the benefits of sharpening are negligible (at least on his monochrome LCD device).
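
For readers unfamiliar with the two operations, here is a small pure-Python sketch on a list of rows of grayscale values (0 = black, 255 = white). It assumes that "dilate" here means thickening dark text on a light page, i.e. a 3x3 minimum filter, and it swaps in a plain Laplacian-style sharpening kernel rather than whatever PDFrasterFarian actually uses; the function names and the `amount` parameter are made up for illustration.

```python
def dilate_dark(pixels):
    """3x3 minimum filter: each pixel takes the darkest value around it."""
    h, w = len(pixels), len(pixels[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            row.append(min(
                pixels[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
            ))
        out.append(row)
    return out


def sharpen(pixels, amount=1.0):
    """Boost each pixel against its four neighbours, clamped to 0..255.

    Equivalent to the kernel [0 -a 0; -a 1+4a -a; 0 -a 0]. Pixels on the
    border reuse the nearest in-bounds neighbour.
    """
    h, w = len(pixels), len(pixels[0])

    def at(y, x):
        return pixels[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    out = []
    for y in range(h):
        row = []
        for x in range(w):
            neighbours = at(y - 1, x) + at(y + 1, x) + at(y, x - 1) + at(y, x + 1)
            value = (1 + 4 * amount) * at(y, x) - amount * neighbours
            row.append(int(min(255, max(0, round(value)))))
        out.append(row)
    return out
```

Flat areas pass through `sharpen` unchanged because the kernel weights sum to one; only edges are exaggerated, which is why the effect may be hard to spot on clean vector text.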

I disagree, but what I think is certain is that in the vast world of photoshop filters, there's gotta be something impressive. Come on... who here's been C&Ping beavers on monkeys and passing them off as paris hilton pics? Speak up!
Old 04-02-2007, 11:51 AM   #11
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.

Posts: 26,450
Karma: 5383257
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Yup, you can use py2exe to embed all the dependencies into a single executable.
Old 04-02-2007, 02:19 PM   #12
ashkulz
Addict
ashkulz will become famous soon enough

Posts: 350
Karma: 705
Join Date: Dec 2006
Location: Mumbai, India
Device: Kindle 1/REB 1200
Quote:
Anyway, what's certain is that we need to split the project up into a backend and a frontend. The backend could then be used by other people for their own frontends. It should also be modular, so that, e.g., people could pass it documents like PDFs, or they could pass raw images and have them post-processed and collated. Or they might pass a PDF and the output resolution, and then receive a folder of images they could collate themselves. Thus, the backend could be extended for use with unforeseen input and output formats.

Thus the backend is in three parts: a rasterizing stage (this stage is capable of autocropping, it can also produce output directly for the frontend for showing previews and for setting manual cropping), a processing stage (which takes lists of images, cropping parameters, processing parameters, etc.), and a collating stage.
I think we should keep things as simple as they can be, as it makes for easier maintainability. Let's just create one (or maybe two) command line tools, and that's it. The GUI simply calls these and gets its job done. It will also enforce clean separation and make sure that additional frontends can be added.
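
To make the "GUI simply calls a command line tool" idea concrete, here is a minimal sketch. The tool name `pdfconvert.py` and its `-o` and `--mono` options are invented for illustration and are not the options of PDFRead or pyprf.py.

```python
import argparse
import sys


def build_parser():
    """Command line interface of the hypothetical converter tool."""
    parser = argparse.ArgumentParser(
        prog="pdfconvert",
        description="Convert a PDF into a reader-friendly file.",
    )
    parser.add_argument("input", help="PDF file to convert")
    parser.add_argument("-o", "--output", help="file to write")
    parser.add_argument("--mono", action="store_true",
                        help="produce monochrome output")
    return parser


def gui_command(input_path, output_path, mono=False):
    """What a GUI would run: the same tool, driven through its arguments."""
    command = [sys.executable, "pdfconvert.py", input_path, "-o", output_path]
    if mono:
        command.append("--mono")
    return command
```

The GUI would pass the list from gui_command() to subprocess and watch the exit code, so any other frontend can reuse the tool without importing its internals.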

What we really need to think about is packaging. On Windows, it is quite easy to package all the required stuff; it will be much more difficult to do so for Linux or OS X.

Quote:
One fun question is how can the processing stage be made better. Right now it's a dilate and a sharpening. The dilate is straightforward but the sharpening has many parameters. I used defaults for PDFR 2.1 but tweaked them for 2.2 for more effect. I don't think anyone actually got to see the 2.2 changes, but Ashkulz currently feels the benefits of sharpening are negligible (at least on his monochrome LCD device).

I disagree, but what I think is certain is that in the vast world of photoshop filters, there's gotta be something impressive. Come on... who here's been C&Ping beavers on monkeys and passing them off as paris hilton pics? Speak up!
Well, I was checking the effects of sharpening on my desktop, not on the reader -- I couldn't see any noticeable difference in the images.

Either way, I see that a standalone GUI has very little utility value -- the rudimentary GUI on Windows is good enough for most people, and on Linux/OSX people are OK with the command line. The real benefit as I see it is integration with a device communication app like libprs500/rebcomm, integrated into a rudimentary library management system (like libprs500-gui). That would offer a 1-stop solution for ebook creation, management and communication with the device.

Ideally, we should develop the GUI on a plugin-style architecture, so that people can easily integrate various apps (rss/word/whatever) without touching the main code.
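
A plugin-style architecture of that kind could be as small as a registry that maps file types to converter functions. This is only a sketch of the idea; `CONVERTERS`, `converter` and the two sample plugins are hypothetical and do no real conversion.

```python
import os

CONVERTERS = {}  # file extension -> function that handles it


def converter(*extensions):
    """Decorator that registers a function for one or more extensions."""
    def register(func):
        for ext in extensions:
            CONVERTERS[ext.lower()] = func
        return func
    return register


@converter(".pdf")
def convert_pdf(path):
    # Placeholder: a real plugin would call the PDF backend here.
    return "would rasterize " + path


@converter(".htm", ".html")
def convert_html(path):
    # Placeholder for an HTML converter plugin.
    return "would convert HTML " + path


def convert(path):
    """Dispatch to whichever plugin claimed this file type."""
    ext = os.path.splitext(path)[1].lower()
    try:
        handler = CONVERTERS[ext]
    except KeyError:
        raise ValueError("no plugin registered for %r" % ext)
    return handler(path)
```

Adding RSS or Word support would then mean dropping in one more decorated function, with no change to convert() or to the GUI that calls it.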
Old 04-02-2007, 04:03 PM   #13
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.

Posts: 26,450
Karma: 5383257
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
I agree with ashkulz. As I said before I'm willing to do the work necessary to

a) Make the libprs500 GUI device independent by defining a set of functions that any device communication software will be expected to provide. I will then refactor the GUI code to achieve as clean a separation between the device communication backend and the GUI as possible. That way, any future device driver writers can use the library management features of the GUI easily.

b) Integrate a conversion GUI into the libprs500-gui. I'm working on a fully open source HTML->LRF converter and I'm happy to add the PDF->LRF/ebook optimized PDF converter as well
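
Point (a) amounts to an abstract interface that every device driver implements. The sketch below shows the shape such an interface might take; the `Device` class and its three methods are assumptions for illustration and are not libprs500's actual API.

```python
import os
from abc import ABC, abstractmethod


class Device(ABC):
    """Functions the GUI expects from any device communication backend."""

    @abstractmethod
    def list_books(self):
        """Return the names of the books stored on the device."""

    @abstractmethod
    def send_book(self, path):
        """Copy the file at path onto the device."""

    @abstractmethod
    def delete_book(self, name):
        """Remove a book from the device."""


class InMemoryDevice(Device):
    """Stand-in driver that keeps its 'storage' in a list, for testing."""

    def __init__(self):
        self.books = []

    def list_books(self):
        return list(self.books)

    def send_book(self, path):
        self.books.append(os.path.basename(path))

    def delete_book(self, name):
        self.books.remove(name)


def sync(device, paths):
    """GUI-side code: it touches the device only through the interface."""
    present = set(device.list_books())
    for path in paths:
        if os.path.basename(path) not in present:
            device.send_book(path)
    return device.list_books()
```

Because sync() never looks past the `Device` methods, a driver for another reader only has to subclass `Device` to plug into the same library management code.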
Old 04-03-2007, 06:08 PM   #14
Shake
Member
Shake began at the beginning.
 
Posts: 20
Karma: 10
Join Date: Dec 2006
That is great news :-). Keep up the good work!
And (but I am sure this is not an important point for you) - I am willing to donate some dollars for the project if it will run on Linux...
Old 04-03-2007, 08:57 PM   #15
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.

Posts: 26,450
Karma: 5383257
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
$$ are always welcome ;-)

Quote:
Originally Posted by Shake
That is great news :-). Keep up the good work!
And (but I am sure this is not an important point for you) - I am willing to donate some dollars for the project if it will run on Linux...