08-22-2007, 01:54 PM   #1
cacapee
Connoisseur
Posts: 77
Karma: 1393
Join Date: Aug 2007
Location: Santa Monica
Device: prs-500
Yet another PDF to LRF converter

Moderator's Note: I've taken down the attached programs because they appear to be in violation of the GPL. They will go back up after the issue is resolved.

Nate the great



Hi, I've taken some of the ideas from existing tools, along with a few refinements of my own, to code this one up. I've attached a few sample conversions to give an idea of what the tool can do.

The refinements are --runpages (which splices adjacent pdf pages into the same image where possible) and --smartcut (which avoids the annoying splits through a line of text at the edge of the image). Another feature is that landscape mode (which is the default) rotates the image itself and doesn't actually use the Reader's landscape mode.
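To make the --smartcut idea a bit more concrete, here is a rough sketch of how cutting at blank lines could work. This is not the code from pdflrf, just an illustration: the page is assumed to be a list of rows of 8-bit grayscale pixels, and the names `blank_rows`, `smart_cut`, `split_page`, the `threshold` of 250 and the `search` window are all made up for the example.

```python
from typing import List, Sequence, Tuple

Image = Sequence[Sequence[int]]  # rows of 8-bit grayscale pixels (assumed layout)


def blank_rows(image: Image, threshold: int = 250) -> List[bool]:
    """One flag per pixel row: True when every pixel in the row is near-white."""
    return [all(px >= threshold for px in row) for row in image]


def smart_cut(image: Image, start: int, screen_height: int, search: int) -> int:
    """Return the end row (exclusive) of the slice that begins at `start`.

    The hard limit is start + screen_height. Starting from that limit we look
    upwards, at most `search` rows, for a blank row and cut there instead.
    If no blank row is found we fall back to the hard limit.
    """
    hard_end = start + screen_height
    if hard_end >= len(image):
        return len(image)  # whatever is left fits on one screen
    blank = blank_rows(image)
    lowest = max(hard_end - search, start + 1)  # always make progress
    for row in range(hard_end, lowest - 1, -1):
        if blank[row]:
            return row
    return hard_end


def split_page(image: Image, screen_height: int, search: int = 40) -> List[Tuple[int, int]]:
    """Split a tall page into (start, end) row ranges no taller than the screen."""
    pieces = []
    start = 0
    while start < len(image):
        end = smart_cut(image, start, screen_height, search)
        pieces.append((start, end))
        start = end
    return pieces
```

A slice never gets taller than the screen because the search only moves the cut upwards. When the text has no gap inside the search window, the cut lands at the hard limit, which is the plain split you would get without smartcut.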

f_l.lrf: pages laid out to be viewed by rotating the Reader (-rs)
f_p.lrf: pages in portrait mode (-prs)
f_po.lrf: pages in portrait mode without runpages and smartcut (-p)
g_2col.lrf: sample of two column mode (-vrs)
comicl.pdf comicp.pdf: comic strips in landscape and portrait mode (uses --nosplitpage -rs)

The pdf files I tested with are linked here:

http://cm.bell-labs.com/cm/ms/what/s...hannon1948.pdf
http://www.comp.nus.edu.sg/~tants/tsm/tsm.pdf

Code:
pdflrf 0.99

A program to generate lrf files for the Sony Reader
Needs Ghostscript to be installed unless you are using poppler

Usage: For 2 column portrait mode
        pdflrf.exe -vrs -i file.pdf -o file.lrf
   For landscape mode
        pdflrf.exe --rotation=-90 -rs -i file.djvu -o file.lrf
   For comics
        pdflrf.exe -nrs --erode 1 -i file.cbz -o file.lrf


  -h, --help                  Print help and exit
  -V, --version               Print version and exit
  -i, --input=STRING          Input file (PDF, DJVU, CBZ)
  -o, --output=STRING         Output file
  -f, --firstpage=INT         First page to process  (default=`1')
  -l, --lastpage=INT          Last page to process  (default=`-1')
      --ghostscript           Use ghostscript (instead of poppler by default)
                                to read pdf files  (default=off)

LRF file metadata
  -t, --title=STRING          Title
  -a, --author=STRING         Author
      --category=STRING       Category
      --publisher=STRING      Publisher

LRF file generation properties
      --fit=STRING            Scale image  (possible values="width",
                                "height", "2xheight" default=`width')
      --rotation=STRING       Rotation  (possible values="-90", "0",
                                "90", "180" default=`-90')
      --filter=STRING         Resizing filter  (possible values="lanczos",
                                "quadratic", "cubic", "catrom",
                                "mitchell", "sinc", "bessel"
                                default=`lanczos')
      --stretch               Stretch image to fit screen  (default=off)
  -v, --vsplit                Vertically split the page  (default=off)
  -r, --runpages              Run pages together  (default=off)
      --pad=INT               Pad (in pixels) to add when concatenating pages
                                (default=`3')
  -s, --smartcut              Cut pages at blank lines  (default=off)
  -n, --nosplitpage           Do not split pages across images  (default=off)
      --notoc                 Do not add TOC  (default=off)

Image processing
      --erode=INT             Size of erosion kernel (2 works well too)
                                (default=`3')
      --overlap=FLOAT         Overlap % between successive pages
                                (default=`0.05')
  -c, --colors=INT            Number of colors in final image  (default=`4')
      --grayscale             Convert image to grayscale  (default=on)
      --nocrop                Do not crop the sides automatically
                                (default=off)
      --outputimages          Write generated images out  (default=off)
      --width=INT             Width of final image  (default=`584')
      --height=INT            Height of final image  (default=`754')

The following options are run before any of the above processing is done
      --trimleft=FLOAT        Trim width*trimleft/100 pixels from the left for
                                all pages  (default=`0')
      --trimright=FLOAT       Trim width*trimright/100 pixels from the right
                                for all pages  (default=`0')
      --trimtop=FLOAT         Trim height*trimtop/100 pixels from the top for
                                all pages  (default=`0')
      --trimbottom=FLOAT      Trim height*trimbottom/100 pixels from the bottom
                                for all pages  (default=`0')
      --eventrimleft=FLOAT    Trim width*trimleft/100 pixels from the left for
                                even pages  (default=`0')
      --eventrimright=FLOAT   Trim width*trimright/100 pixels from the right
                                for even pages  (default=`0')
      --eventrimtop=FLOAT     Trim height*trimtop/100 pixels from the top for
                                even pages  (default=`0')
      --eventrimbottom=FLOAT  Trim height*trimbottom/100 pixels from the bottom
                                for even pages  (default=`0')
      --oddtrimleft=FLOAT     Trim width*trimleft/100 pixels from the left for
                                odd pages  (default=`0')
      --oddtrimright=FLOAT    Trim width*trimright/100 pixels from the right
                                for odd pages  (default=`0')
      --oddtrimtop=FLOAT      Trim height*trimtop/100 pixels from the top for
                                odd pages  (default=`0')
      --oddtrimbottom=FLOAT   Trim height*trimbottom/100 pixels from the bottom
                                for odd pages  (default=`0')
      --fuzz=FLOAT            Fuzz factor % for matching colors
                                (default=`0.01')

version 0.2 now reads cbz and djvu files. It also reads pdf files using poppler, so you do not need to install ghostscript.

version 0.3 has a windows gui. --portrait has been removed and replaced by --rotation. Default is portrait now. Metadata from pdf files is also read automatically.

version 0.4 adds drag/drop and batch processing (drag multiple files over) + other small refinements (if none of smartcut/runpages/splitpage is specified, each page is scaled to fit in one image)
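The drag/drop batching is a gui feature. For the commandline build, something like the following wrapper could get a similar effect. It is only a sketch: the `-r`, `-s`, `-i` and `-o` flags and the `pdflrf.exe` name come from the help text above, while `build_command`, `convert_all`, the extension list and the `dry_run` switch are made up for the example.

```python
import subprocess
from pathlib import Path
from typing import List, Sequence

# Input types listed in the help text; adjust to whatever your build accepts.
INPUT_SUFFIXES = {".pdf", ".djvu", ".cbz"}


def build_command(src: Path, exe: str = "pdflrf.exe",
                  flags: Sequence[str] = ("-r", "-s")) -> List[str]:
    """Build one pdflrf command line that writes <name>.lrf next to the input."""
    out = src.with_suffix(".lrf")
    return [exe, *flags, "-i", str(src), "-o", str(out)]


def convert_all(folder: Path, dry_run: bool = True) -> List[List[str]]:
    """Build (and optionally run) one command per supported file in `folder`."""
    commands = [
        build_command(path)
        for path in sorted(folder.iterdir())
        if path.suffix.lower() in INPUT_SUFFIXES
    ]
    if not dry_run:
        for command in commands:
            # check=True stops the batch at the first file that fails
            subprocess.run(command, check=True)
    return commands
```

With `dry_run=True` (the default here) nothing is executed, so you can print the returned lists and check the command lines before letting it loose on a whole folder.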

version 0.5 has support for pre-trimming the input pages. The trims are specified as a % of the page to cut away from the left/right/top/bottom. They can also be specified independently for even/odd pages. Also unicode metadata is now supported.
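As a worked example of the trim arithmetic: the help text gives the rule as width*trimleft/100 pixels from the left (and likewise for the other sides). The sketch below just applies that rule to get a crop box. The function name `trim_box` and rounding down to whole pixels are my assumptions for the example, not necessarily what pdflrf does internally.

```python
from typing import Tuple


def trim_box(width: int, height: int,
             left: float = 0.0, right: float = 0.0,
             top: float = 0.0, bottom: float = 0.0) -> Tuple[int, int, int, int]:
    """Return (x0, y0, x1, y1) of the page area kept after trimming.

    Each trim is a percentage of the page size, e.g. left=5 removes
    width*5/100 pixels from the left edge. Fractions of a pixel are
    rounded down (an assumption made for this sketch).
    """
    x0 = int(width * left / 100)
    x1 = width - int(width * right / 100)
    y0 = int(height * top / 100)
    y1 = height - int(height * bottom / 100)
    if x0 >= x1 or y0 >= y1:
        raise ValueError("trims remove the whole page")
    return (x0, y0, x1, y1)


# A 1000x1500 pixel page with 5% off the left, 10% off the right,
# 2% off the top and 4% off the bottom:
print(trim_box(1000, 1500, left=5, right=10, top=2, bottom=4))
```

For even/odd pages you would call the same function with the even or odd percentages, for instance to cut a wider inner margin on one side of a scanned book.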

version 0.6 adds support for unicode filenames (previously only unicode metadata was supported), cbr/rar files, better error reporting and assorted bug fixes.

version 0.7 adds a preview of pre-trimming and a catrom filter for resizing images. png files are now generated for embedding in lrf files. Converted over to using threads, so batching is much improved.

version 0.8 adds support for Table of Contents in pdf files. It is possible to preview output images to test out various settings. An experimental linux build (built on Ubuntu) has been added. Improved threading support so processing should be faster. Changed default colors to 4 to reduce frequency of file size questions. Added more filtering options and better dithering so images should look a lot better.
Fixed threading bug that causes it to hang occasionally under linux. Fixed TOC. Erosion now works sensibly with small images (like those of comic strips). Can also go down to 2 colors.
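For anyone wondering what the colors setting means in practice, here is a small sketch of reducing an 8-bit gray value to a handful of evenly spaced levels. It only shows plain quantization. The dithering mentioned above is not reproduced here, and the function name `quantize` is made up for the example.

```python
def quantize(value: int, colors: int = 4) -> int:
    """Map an 8-bit gray value (0..255) to the nearest of `colors` even levels.

    With colors=4 the levels are 0, 85, 170 and 255; with colors=2 the
    result is pure black or pure white.
    """
    if colors < 2:
        raise ValueError("need at least 2 colors")
    if not 0 <= value <= 255:
        raise ValueError("value must be in 0..255")
    steps = colors - 1
    # Integer form of round(value / 255 * steps), avoiding float rounding
    index = (value * steps * 2 + 255) // (2 * 255)
    return index * 255 // steps


row = [0, 40, 100, 128, 200, 255]
print([quantize(v, 4) for v in row])
```

Fewer levels means fewer distinct pixel values for the png compression to deal with, which fits the file size remark above.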

version 0.9 Added padding (in pixels) when using runpages. Fixed crash bug when generating toc. Added back ghostscript support. Added option to disable generation of TOC.

version 0.99 Fixed Librie compatibility (maybe). Adjustable image size on output. Output zip files (if extension is cbz or zip). Added RGB output. Added post stretch of image to fit page. Added options to fit image by height/width/2*height etc. Sort files in rar and zip.

Fixed bug in dos and linux commandline versions that ignored toc. Fixed metadata bug.

I've broken out the dos commandline, windows gui and linux commandline versions into separate files to better track usage.
Attachments Pending Approval
File Type: lrf f_l.lrf
File Type: lrf f_p.lrf
File Type: lrf g_2col.lrf
File Type: lrf comicp.lrf
File Type: zip pdflrfwin-0.9.zip
File Type: zip pdflrfdos-0.9.zip
File Type: gz pdflrflinux-0.9.gz
File Type: zip pdflrfwin-0.99.zip
File Type: gz pdflrflinux-0.99.gz
File Type: zip pdflrfdos-0.99.zip

Last edited by cacapee; 09-30-2007 at 06:17 PM.