Old 04-04-2007, 02:09 AM   #16
alex_d
Addict
Posts: 303
Karma: 187
Join Date: Dec 2006
Device: Sony Reader
Ashkulz, the most important thing right now is to create, or plan for, a framework that can be easily extended or repurposed for new uses and by new applications. Anyway, all it'd be is three executables, and they'd be doing the same things the monolithic script is doing now. There'd just be a bit more work talking through a defined interface, but it'll pay off in much greater flexibility: the flexibility to change pieces.

How about we discuss a spec?

OK, so the rasterizer exe would expose some of the things that Ghostscript etc. can do:
-- input - pdf file or list of files
-- output - output folder and filename
-- output size in pixels and format (8bit, gray, color)
-- autocropping, explicit cropbox
-- (opt) output file type (png, jpg, bmp, raw)
-- (opt) rotation
-- (opt) device-specific features (eg ghostscript's font-rendering modes)
this exe prints out the names of the files it processes so that these could be piped or saved to a variable (or to a file). The other exes should be able to accept input filenames piped in (and maybe from a file).
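Just to make this concrete, here's a rough sketch of what such a rasterizer front-end could look like in Python. Every option name below is made up for illustration; this is not an existing PDFRead or Ghostscript interface, and the actual rendering call is left as a stub:
Code:
# rasterize.py -- hypothetical front-end; option names are illustrative only
import argparse, os

parser = argparse.ArgumentParser(description="rasterize PDF pages to images")
parser.add_argument("input", nargs="+", help="pdf file or list of files")
parser.add_argument("--outdir", default=".", help="output folder")
parser.add_argument("--pattern", default="page%04d.png", help="output filename pattern")
parser.add_argument("--dpi", type=int, default=300)
parser.add_argument("--format", choices=["8bit", "gray", "color"], default="gray")
parser.add_argument("--cropbox", help="explicit cropbox x0,y0,x1,y1 (default: autocrop)")
parser.add_argument("--rotate", type=int, choices=[0, 90, 180, 270], default=0)
args = parser.parse_args()

page = 1
for pdf in args.input:
    # call ghostscript/xpdf here to render each page of 'pdf' into args.outdir ...
    outfile = os.path.join(args.outdir, args.pattern % page)
    page += 1
    print(outfile)   # print each produced file so the next exe can read it from a pipe
The only contract the other exes would need is "filenames come in on stdin, one per line".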


the processing exe would be:
-- input/output filenames
-- output resolution, format
-- (opt) fit (centered, upper-left, stretched)
-- (impl-specific, opt) dilate factor
-- (impl-specific, opt) eg sharpen or other filter parameters


The collating exe would just take a list of files and bind them into a format for a specific device. It would also accept a TOC as a file or something. (People could write new .exe's to add support for new/old devices and file formats.)
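Again, just a sketch: the --toc flag and the TOC file format below are invented, and the device-specific binding is left as a stub:
Code:
# collate.py -- hypothetical; reads image filenames from stdin, one per line
import sys

toc = {}
if len(sys.argv) > 2 and sys.argv[1] == "--toc":
    # assumed TOC file format: "page_number<TAB>title" per line
    for line in open(sys.argv[2]):
        page, title = line.rstrip("\n").split("\t", 1)
        toc[int(page)] = title

images = [line.strip() for line in sys.stdin if line.strip()]
# the device-specific binding (LRF or whatever) would happen here
print("would bind %d pages with %d TOC entries" % (len(images), len(toc)))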


misc ideas-
overcropping... option to crop not at the first black pixel but only after, say, a few dozen (so dust, dots, or lines don't mess up autocropping)
output filenames... ImageMagick etc. can take the output filename as e.g. "fileA%02d.png" and produce fileA01.png, fileA02.png
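The same printf-style pattern expands the obvious way in Python, for whoever ends up implementing this:
Code:
>>> "fileA%02d.png" % 1
'fileA01.png'
>>> ["fileA%02d.png" % n for n in range(1, 4)]
['fileA01.png', 'fileA02.png', 'fileA03.png']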


I think a standalone app would be used more than an integrated one. Personally, I just use SD cards and never Sony Connect. Also, a standalone app can focus better on adding support to do all the things that could give the best results. Maybe doing it in Qt will make it more difficult to do something fancy that lets you preview, crop, rotate, etc. I don't know, but I know that manually cropping in Acrobat is very, very helpful. However, I've never found a free alternative to do manual cropping.
Old 04-04-2007, 12:25 PM   #17
curiouser
Junior Member
Posts: 9
Karma: 10
Join Date: Jul 2006
Sounds like good stuff is happening. I'm swamped closing out my last semester of school, so I won't be able to contribute for a bit.

Just wanted to point out two bits of code from my work that may be the most useful:

1) Overcropping is already implemented - check the trimNoise function. Big help for scanned PDFs (such as Google Books).
2) Proper centering of images. Related code is found in trimNoise as well as the main processing function.
Old 04-04-2007, 03:42 PM   #18
ashkulz
Addict
Posts: 350
Karma: 705
Join Date: Dec 2006
Location: Mumbai, India
Device: Kindle 1/REB 1200
Quote:
Ashkulz, the most important thing right now is to create, or plan for, a framework that can be easily extended or repurposed for new uses and by new applications. Anyway, all it'd be is three executables, and they'd be doing the same things the monolithic script is doing now. There'd just be a bit more work talking through a defined interface, but it'll pay off in much greater flexibility: the flexibility to change pieces.

How about we discuss a spec?
Agreed. I've been thinking about this too (couldn't post as I was too busy yesterday). The most sensible way to design the system is to think of it in terms of a pipeline, exactly the way GStreamer is designed. So effectively, you write a lot of plugins which implement discrete actions in the whole process (rasterizing, cropping, dilation, etc -- all that you mentioned). Each plugin declares some input and output "pads". We can create different types of "pads", so that you can't accidentally connect an incompatible set of inputs/outputs. I will assume that we write all of this in Python, which I think is best as it is cross-platform. So now we have the following components in the system:
  1. the base framework, which defines interactions and the various types of pads
  2. the actual plugins, which use the framework and define various types of input/output pads
  3. a low-level command-line interface which will allow one to create a pipeline and execute it, similar to gst-launch. See this as an example.
  4. a command line app, which will parse command-line parameters, do some validations and then finally create a pipeline and execute it via the plugins/framework.

So #4 will essentially be a replacement for what we currently have. Apps that want to use a part (or any combination) of the pipeline will essentially use #3 directly. So ideally, we should have very "thin" glue code in #3 and #4, with most of the logic being in #1 and #2. Also, trying out new approaches is very painless, as it is easy to add a new plugin and introduce it into the pipeline via #3. As an example, the current process can be represented as:
Code:
filesrc location=input.pdf ! pdftops ! gsrasterize dpi=300 ! autocrop ! dilate ! resize width=565 height=784 ! makelrf author=XYZ title=foo | libprs500-send

I don't know if you're familiar with electronics/IC design, but that's essentially what you do there. It would make development MUCH easier and make the whole process much easier to tweak for everyone (once the initial bump is past, of course). So let's say I want to use xpdf for rasterizing (it's much smaller than gs on win32): I replace gsrasterize with xpdfrasterize (which is the only thing I need to write) and then recreate/rerun the pipeline.
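To give a feel for it, here is a bare-bones sketch of what a plugin with typed pads might look like. None of these class names exist anywhere yet; it's pure illustration, not actual PDFRead code:
Code:
# Illustrative only -- not actual PDFRead code.
class Pad(object):
    def __init__(self, kind):              # kind: "pdf", "image", "lrf", ...
        self.kind = kind

class Element(object):
    sink = None                            # pad type this plugin consumes
    src = None                             # pad type this plugin produces
    def process(self, data, **params):
        raise NotImplementedError

class GsRasterize(Element):
    sink, src = Pad("pdf"), Pad("image")
    def process(self, pdf, dpi=300):
        # ...invoke ghostscript here, return the list of page images...
        return []

class XpdfRasterize(Element):
    sink, src = Pad("pdf"), Pad("image")   # same pads, so it is a drop-in swap
    def process(self, pdf, dpi=300):
        # ...invoke xpdf's pdftoppm here instead...
        return []

def link(upstream, downstream):
    # refuse to connect incompatible pads (e.g. feeding a PDF to an image stage)
    if upstream.src.kind != downstream.sink.kind:
        raise ValueError("incompatible pads: %s -> %s"
                         % (upstream.src.kind, downstream.sink.kind))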

Quote:
OK, so the rasterizer exe would expose some of the things that Ghostscript etc. can do:
-- input - pdf file or list of files
-- output - output folder and filename
-- output size in pixels and format (8bit, gray, color)
-- autocropping, explicit cropbox
-- (opt) output file type (png, jpg, bmp, raw)
-- (opt) rotation
-- (opt) device-specific features (eg ghostscript's font-rendering modes)
this exe prints out the names of the files it processes so that these could be piped or saved to a variable (or to a file). The other exes should be able to accept input filenames piped in (and maybe from a file).

the processing exe would be:
-- input/output filenames
-- output resolution, format
-- (opt) fit (centered, upper-left, stretched)
-- (impl-specific, opt) dilate factor
-- (impl-specific, opt) eg sharpen or other filter parameters

The collating exe would just take a list of files and bind them into a format for a specific device. It would also accept a TOC as a file or something. (People could write new .exe's to add support for new/old devices and file formats.)

misc ideas-
overcropping... option to crop not at the first black pixel but only after, say, a few dozen (so dust, dots, or lines don't mess up autocropping)
output filenames... ImageMagick etc. can take the output filename as e.g. "fileA%02d.png" and produce fileA01.png, fileA02.png
All the features you mentioned above should be implemented as plugins, with the necessary parameters.

Quote:
I think a standalone app would be used more than an integrated one. Personally, I just use SD cards and never Sony Connect. Also, a standalone app can focus better on adding support to do all the things that could give the best results. Maybe doing it in Qt will make it more difficult to do something fancy that lets you preview, crop, rotate, etc. I don't know, but I know that manually cropping in Acrobat is very, very helpful. However, I've never found a free alternative to do manual cropping.
To each his own. I mean, PDFRead is working for most people and that's the way it should be for them. If someone finds they need to do something custom, then this gives them a gradual path for delving deeper and deeper. With the above approach, whether you use the command-line app (#4) or drive the pipeline directly from a GUI (#3) becomes irrelevant: both are equally easy to use for different sets of people, and it allows other developers to leverage PDFRead as they see fit.

As an aside, we should call it something other than PDFRead or PDFRasterFarian: the above is not merely a tool, it is an ebook conversion framework. I mean, I can imagine html being a source plugin sometime in the future, so this could be a standard way of interacting with ebook formats, devices and whatnot.
Old 04-04-2007, 11:17 PM   #19
alex_d
Addict
Posts: 303
Karma: 187
Join Date: Dec 2006
Device: Sony Reader
What exactly do you mean by plugins? Do you mean the "rasterizer" and "post-processing" components that I'm talking about would themselves be composed of smaller pieces?

"the above is not merely a tool, it is a ebook conversion framework. I mean, I can imagine that html being a source plugin sometime in the future, so this could be a standard way of interacting with ebook formats, devices and whatnot."

Right now I was just thinking about a framework that handled image-based ebooks. For HTML, and indeed for a larger audience, you would need to support native-text formats (although I dunno... native text would never look as good as dilated and processed images). Handling native text would mean creating an intermediary format with formatting and embedded links that could carry HTML, PDF, RTF, etc. and then be reprocessed into LRF, PDF, starebook, etc. That is... ambitious. And it'd have to work perfectly (i.e. just as well as a direct HTML->LRF conversion).

If we just stick to working with images (and even claim that's the superior way to do things), I think it makes things much simpler (and much easier to get right). We can omit things like sophisticated pads that keep track of their own dependencies. Simply moving images from one folder to another would be fine and would even make it easier for other developers to hook in. (It's still the same spirit as the pads, but just a simpler implementation.)
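For instance, a folder-to-folder stage could be as dumb as this (the directory names and the transform hook are just placeholders):
Code:
# Sketch of a folder-based stage: read every image in one folder,
# run some processing on it, drop the result into the next folder.
import os

def run_stage(indir, outdir, transform):
    if not os.path.isdir(outdir):
        os.makedirs(outdir)
    for name in sorted(os.listdir(indir)):
        data = open(os.path.join(indir, name), "rb").read()
        out = open(os.path.join(outdir, name), "wb")
        out.write(transform(data))
        out.close()

# an external tool hooks in just by reading e.g. "02_cropped/" and writing
# into "03_dilated/" -- no framework API needed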

However, let's ask the question: if, say, we only work with images, what things could/would/would-want-to be done by others? Are there things that can't be done by a 3-layer framework of Create images, Reprocess images, Bind images (provided each layer exposes enough features)? What are the usage scenarios?
Old 04-05-2007, 12:28 AM   #20
ashkulz
Addict
Posts: 350
Karma: 705
Join Date: Dec 2006
Location: Mumbai, India
Device: Kindle 1/REB 1200
Quote:
What exactly do you mean by plugins? Do you mean the "rasterizer" and "post-processing" components that I'm talking about would themselves be composed of smaller pieces?
Yep, very much. The current script is getting too big, and not so easy to understand at first glance. The "plugins" would allow one to abstract out the steps to take in the pipeline, and then to weave the individual steps in any manner that the calling tool/app chooses.

From the point of view of the calling tool, there would be only one executable which would allow one to choose and setup the pipeline. All the plugins and other low-level details will be in code, and not exposed to the user.
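Roughly, that one executable would just parse a pipeline description and hand it over to the framework -- something like this completely hypothetical runner, using the kind of plugin registry sketched above:
Code:
# Hypothetical low-level runner: turn "gsrasterize dpi=300 ! autocrop ! ..."
# into a list of (plugin_name, params) and dispatch to registered plugins.
import sys

REGISTRY = {}          # plugin name -> factory; filled in by the plugin modules

def parse(pipeline):
    stages = []
    for chunk in pipeline.split("!"):
        parts = chunk.split()
        name, params = parts[0], dict(p.split("=", 1) for p in parts[1:])
        stages.append((name, params))
    return stages

if __name__ == "__main__":
    for name, params in parse(sys.argv[1]):
        print("would run %s with %s" % (name, params))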

Quote:
Quote:
the above is not merely a tool, it is an ebook conversion framework. I mean, I can imagine html being a source plugin sometime in the future, so this could be a standard way of interacting with ebook formats, devices and whatnot.
Right now I was just thinking about a framework that handled image-based ebooks. For HTML, and indeed for a larger audience, you would need to support native-text formats (although I dunno... native text would never look as good as dilated and processed images). Handling native text would mean creating an intermediary format with formatting and embedded links that could carry HTML, PDF, RTF, etc. and then be reprocessed into LRF, PDF, starebook, etc. That is... ambitious. And it'd have to work perfectly (i.e. just as well as a direct HTML->LRF conversion). If we just stick to working with images (and even claim that's the superior way to do things), I think it makes things much simpler (and much easier to get right).
Agreed, but what I meant was that one can easily imagine some of the things we develop here being integrated as different types of "pads" or whatever. I'm not proposing to do anything on this at all, just that it leaves future scope for expansion -- the framework would already be there, and reuse would be dead simple.

Quote:
We can omit things like sophisticated pads that keep track of their own dependencies. Simply moving images from one folder to another would be fine and would even make it easier for other developers to hook in. (It's still the same spirit as the pads, but just a simpler implementation.)
I never said anything about pads keeping track of their own dependencies. All I meant is that if a particular stage expects an image as input, then we shouldn't be able to pass a PDF in there (or vice versa). The stage should validate these things and then move on.

I disagree about the folder-to-folder thing -- that's a poor solution, as it means we have to create and maintain that many folders. Why communicate over the filesystem when you can communicate much more clearly via code? Also, you get around that in PDFRasterFarian by fixing the stages upfront and pre-creating folders in the installation directory. That is not feasible on other platforms, plus it implicitly means you can run only one instance of PDFRasterFarian at a time. PDFRead has no such limitation, and I think that supporting (simultaneous) batch processing is very important.

Quote:
However, let's ask the question: if, say, we only work with images, what things could/would/would-want-to be done by others? Are there things that can't be done by a 3-layer framework of Create images, Reprocess images, Bind images (provided each layer exposes enough features)? What are the usage scenarios?
If each layer exposes enough features to turn on/off features individually, the command line options for it will grow quite a bit (see PDFRead). It is much better to approach it conceptually as a pipeline than as passing this set of parameters to stage 1, another set to stage 2, and so on.

Usage scenarios are simple:
  1. User A wants to use the framework "as is" in one of the default profiles
  2. User B wants to customize one of the stages in the pipeline. He/she runs a tool that will print the default pipeline for a profile, customizes it and then runs it directly (or saves it directly as a new profile).
  3. User C wants to add or drop stages in the pipeline (e.g. remove dilation for comics, add a manual cropping stage, etc)
  4. User D is a tool writer that wants to integrate the entire conversion process (with preview). This would be easy, as one would run a shorter pipeline or one customized only to process a few pages.
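For scenarios 2 and 3, I imagine something roughly like this (the stage names come from the example pipeline earlier in the thread; how the resulting string gets fed back to the tool is still an open question):
Code:
# Hypothetical: take the default pipeline for a profile and drop the dilation
# stage, e.g. for comics (stage names as in the example pipeline above).
default = ("filesrc location=input.pdf ! pdftops ! gsrasterize dpi=300 ! "
           "autocrop ! dilate ! resize width=565 height=784 ! "
           "makelrf author=XYZ title=foo")
stages = [s.strip() for s in default.split("!")]
custom = " ! ".join(s for s in stages if not s.startswith("dilate"))
print(custom)   # same pipeline, minus dilation; save it as a new profile or run it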

On the whole, I think the most compelling argument would be the transparency and simplicity from the user/tool writer point of view. It will also make the code much more modular and easier to maintain.
Old 04-09-2007, 06:09 AM   #21
alex_d
Addict
Posts: 303
Karma: 187
Join Date: Dec 2006
Device: Sony Reader
"Why communicate over the filesystem when you can communicate much more clearly via code?"

Using folders as pads is a bit dirty (especially for concurrent conversions... although those should really be batched and run sequentially anyway) but it is _somewhat_ elegant and, above all, _very_ easy to hook into and extend.

Say I have a program that can be told from the command-line to accept some input files and create some output files. How would I integrate it into your framework?

"PDFRead has no such limitation, and I think that supporting (simultaneous) batch processing is very important."

Actually, I think batching serially rather than concurrently makes more sense. You get your first output quicker, and there is no problem if you want to convert an obscene number of files. (Even a few dozen concurrent conversions would kill the RAM.)

"If each layer exposes enough features to turn on/off features individually, the command line options for it will grow quite a bit (see PDFRead). "

Well, the command line options wouldn't be for the user to use but for the developer writing a wrapper. Surely it'll be much easier on (and give more freedom to) a developer to code a long command line in his script than to output a custom pipeline file?


In the end, though, there are two questions: can the sophisticated framework you speak of be implemented in theory (i.e. is the concept compatible with being very flexible and easy to extend)? And: will such a framework actually be implemented by us (i.e. will it be too much work)? The folders approach, I think, has both points going for it.

I must say, however, I like the cut of your jib.

Last edited by alex_d; 04-09-2007 at 06:19 AM.
Old 04-25-2007, 08:08 AM   #22
ashkulz
Addict
Posts: 350
Karma: 705
Join Date: Dec 2006
Location: Mumbai, India
Device: Kindle 1/REB 1200
Okay, I've implemented the ideas which I mentioned here in the 1.6 release. You can look at the code at

http://pdfread.svn.sourceforge.net/v...pdfread/trunk/

Please see the PDFRead 1.6 thread for other features added in this release.
Old 06-02-2007, 10:40 AM   #23
Shake
Member
Posts: 20
Karma: 10
Join Date: Dec 2006
Any progress? Can I try something?
Old 06-03-2007, 11:39 PM   #24
ashkulz
Addict
Posts: 350
Karma: 705
Join Date: Dec 2006
Location: Mumbai, India
Device: Kindle 1/REB 1200
Quote:
Originally Posted by Shake
Any progress? Can I try something?
I'm not clear on what you mean. PDFRead is already available, so do you mean progress on PDFRead or something else?