Old 04-25-2007, 08:02 AM   #1
ashkulz
PDFRead 1.7 released

UPDATE: PDFRead 1.7 has been released. The changes for 1.7 and the batch conversion instructions come first, followed by the initial release announcement for 1.6.

I've released PDFRead 1.7, which has minor bug fixes and enhancements. Changes in this release:
  • add a "landscape-half" mode which splits a page into two even halves (gdxf's suggestion)
  • if the output document does not have the proper file extension, then append it automatically.
  • remove imagemagick and use pngnq for color reduction.
  • fix problems that occurred when the PDF has an incorrect TOC referring to an invalid page. Also added the --no-toc option to disable TOC generation.

Also, batch conversion can now be done on Windows for all PDFs in a folder.
  1. Download the file attached to the linked post and rename it to pdfread-batch.bat
  2. Open up the renamed file, and change the set OPT= line to use the appropriate profile. In case you have installed in a non-default location, change the set LOC= line too.
  3. Copy the batch file into the directory containing the PDFs you want to convert, and double-click it. Please do not put the directory anywhere on the Desktop or in My Documents, as this can cause problems. Put it somewhere in the root of your drive (C:, D:).
  4. The filename will be used as the book title, so be sure to name files properly. Please ensure that the filename does not contain special characters not present in UTF-8. An ebook will be created with the same name but with the given extension (i.e. sample.pdf => sample.lrf).
In case you want to customize further:
  1. Do a normal conversion with your custom params for a single file and copy the command line options to a text file. Some advice on how to copy the options from the window:
    Quote:
    Originally Posted by alex_d
    To copy text from a CMD window, right-click on the title bar (the bar that has the X and minimize buttons), choose Properties, and then enable QuickEdit mode. This lets you highlight text and copy it by right-clicking on it. Copy everything, even if you have to scroll up.
  2. Copy the command line parameters and replace the set OPT= mentioned above. Do NOT include the input filename, the title (-t option) or the call to pdfread, just the options. The value should be valid command line options.

People on OS X/Linux can hack together a similar script very easily, so I won't bother to post it. If you do want such a script, let me know.
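For anyone on OS X/Linux who wants a starting point, here is a rough sketch of what such a script might look like in Python. It mirrors the Windows batch file: the options are kept in one place (like the set OPT= line) and the filename becomes the title via -t. Note the assumptions: that the executable is on your PATH as pdfread, that the output file is given with -o, and that you want .lrf output. The -o flag in particular is my guess for illustration, so check it against the command line the GUI prints before relying on it.

```python
import subprocess
from pathlib import Path

# Equivalent of the "set OPT=" line: options only, no input file and no -t.
# "--no-toc" is taken from the 1.7 changelog; substitute your own options.
OPTIONS = ("--no-toc",)

# ASSUMPTIONS (not confirmed by the announcement): the output flag and the
# extension. Adjust both to match the command line that PDFRead prints.
OUTPUT_FLAG = "-o"
OUTPUT_EXT = ".lrf"


def build_command(pdf, options=OPTIONS, output_flag=OUTPUT_FLAG, ext=OUTPUT_EXT):
    """Return the argument list for converting one PDF.

    The file name (without extension) is used as the book title, and the
    output gets the same name with the chosen extension.
    """
    pdf = Path(pdf)
    output = pdf.with_suffix(ext)
    return ["pdfread", *options, "-t", pdf.stem, output_flag, str(output), str(pdf)]


def convert_folder(folder="."):
    """Run the conversion for every PDF in a folder, one after the other."""
    for pdf in sorted(Path(folder).glob("*.pdf")):
        command = build_command(pdf)
        print(" ".join(command))  # show what is being run, like the GUI does
        subprocess.run(command, check=False)  # keep going if one book fails
```

Call convert_folder("/path/to/books") from the folder of your choice. Passing the arguments as a list avoids shell quoting problems with spaces in filenames.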


Original announcement follows


After a long wait, PDFRead 1.6 has been released. You can download it from PDFRead @ SourceForge.

The focus of this release has been to rewrite the code for better maintainability. It can now be easily integrated into other tools. PDFRead now has a plugin-based architecture, which makes it easy to add new features, as I've already done for this release.

Lots of new image processing options have been added to PDFRead. unpaper integration ensures that bad scans will be cleaned up properly. The new cropping algorithm removes whitespace very aggressively, even from the middle of the page, without any loss of content. All images are now run through an edge-enhancement filter, which is the same one used by both rbmake and RasterFarian.
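To give an idea of what removing whitespace "from the middle of the page" means, here is a minimal sketch in plain Python. It is not PDFRead's actual algorithm, just an illustration under simple assumptions: the page is a grid of grayscale values, a row counts as blank when every pixel is white, and the function name and parameters are made up for this example.

```python
def collapse_blank_rows(pixels, white=255, keep=1):
    """Drop all-white rows from a page given as a list of pixel rows.

    Leading and trailing blank rows are removed completely. Each run of
    blank rows inside the page is shortened to at most `keep` rows, so no
    row that contains ink is ever lost.
    """
    def is_blank(row):
        return all(value >= white for value in row)

    result = []
    blank_run = 0
    for row in pixels:
        if is_blank(row):
            blank_run += 1
            # Nothing kept yet means this is a top margin: skip it.
            # Otherwise keep only the first `keep` rows of the run.
            if result and blank_run <= keep:
                result.append(row)
        else:
            blank_run = 0
            result.append(row)

    # Whatever blank rows remain at the end are the bottom margin.
    while result and is_blank(result[-1]):
        result.pop()
    return result
```

The same idea applied to columns would trim the left and right margins. A real implementation would also have to tolerate scanner noise instead of requiring perfectly white rows.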

Support for the TIFF and IMGLIST input formats has been added. The IMGLIST format is a simple text file containing a list of images which are to be considered as a single document.
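If you have a folder of scanned pages, a list file like this can be generated rather than typed by hand. The sketch below assumes one image path per line, which seems the natural reading of "a simple text file containing a list of images", but the exact syntax is not spelled out here, so treat the layout as an assumption. The function name and the *.png pattern are only for illustration.

```python
from pathlib import Path


def write_imglist(image_dir, list_path, pattern="*.png"):
    """Write the images found in image_dir to a text file, one path per line.

    Files are sorted by name, so the page order follows the file names.
    Returns the list of image paths that were written.
    """
    images = sorted(Path(image_dir).glob(pattern))
    Path(list_path).write_text("".join(f"{image}\n" for image in images))
    return images
```

Since the sort is alphabetical, name the pages with zero-padded numbers (page_001.png, page_002.png, ...) so that page 10 does not end up before page 2.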

Batch support is not directly present for Windows, but can be achieved via a batch file. The command line used to convert each book (using the current settings) is printed before conversion. You can then copy this to tweak your conversion settings. Users of Linux/OS X are assumed to be familiar with the command-line, and the batch support can be achieved by scripting.

You can also specify a range of pages for conversion. This has the side-effect of giving a preview feature, as specifying the same page as the start and end page will run the processing only for that page.
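The preview trick falls out of the page range logic naturally, as this small sketch shows. It is only an illustration of the idea, not PDFRead's code, and the function and parameter names are hypothetical.

```python
def pages_to_process(total_pages, first=None, last=None):
    """Return the 1-based page numbers selected by an inclusive page range.

    A missing bound defaults to the start or end of the document. Using the
    same page for both bounds selects exactly one page, which is what makes
    a quick preview possible.
    """
    first = 1 if first is None else first
    last = total_pages if last is None else last
    if not 1 <= first <= last <= total_pages:
        raise ValueError(f"invalid page range {first}-{last} for {total_pages} pages")
    return list(range(first, last + 1))
```

For example, pages_to_process(200, 17, 17) selects only page 17, so you can check your settings on one page before converting the whole book.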

The Windows GUI has been revamped: there are now tooltips everywhere, and there is no "advanced" page anymore. If you do want to control those parameters, please use the command line directly.

Lots of other minor tweaks have gone into this release.

The detailed changelog for this release:
  • revamped the Windows GUI: added tooltips, preview feature and show the command line options when executed (useful for batch execution).
  • add support for TIFF and a list of page images for input.
  • add unpaper support for image cleanup.
  • add extremely aggressive whitespace detection, even in the middle of the page text.
  • added an edge-enhancement filter, similar to rbmake and RasterFarian.
  • allow all processing stages to be selectively disabled.
  • allow a page range to be specified for conversion.
  • tweak the prs-500 profile to rotate right instead of left (thanks gdxf)
  • add an optional step to optimize generated PNG images via OptiPNG.
  • removed the dependency on xpdf.
  • removed the autocontrast and ghostscript cropping features (no longer useful).
  • fix problem where the IMP file was not created if the latest eBook Publisher was not installed.
  • complete overhaul of the code for better maintainability.
Some screenshots of the effect of the various image processing options are also attached.
Attached Thumbnails:
  • dilation_before.png, dilation_after.png
  • crop_before.png, crop_after.png
  • unpaper_before.png, unpaper_after.png
  • edge_enhance_before.png, edge_enhance_after.png

Last edited by ashkulz; 04-30-2007 at 01:19 AM.