sendtoiliad.py:
This script extracts metadata from ebooks and writes it in a form that is displayed in the contentlister of the Irex Iliad. It also provides a thumbnail of the cover, replacing the dull, generic file icon.

Formats supported:
Format support is modular, so new formats can be added by dropping a module in the correct directory. Current modules are:
PDF
Epub (In development)
Lit (through conversion to Epub) (In development)

Platform support:
This script was developed on Ubuntu 8.10. I do not have other systems available for testing, but it should work on any modern Unix-ish system, including Mac OS X. Windows is not supported, although advanced users may be able to get it to run by changing the module search path.

Base requirements:
Python with standard libraries (optparse, sys, os, shutil, xml.dom.minidom, datetime)
Python Imaging Libraries (PIL)

PDF Support:
Python standard libraries (re, subprocess, cStringIO)
PyPDF (available from http://pybrary.net/pyPdf/)
pdftoppm (for cover thumbnails, part of the xpdf/poppler utils package)

Epub support:
Python standard libraries and Python Imaging Libraries.

Lit support:
incomplete

Installation:
1. Place sendtoiliad.py in your path. 
2. Place modules (pdf.py) in one of: /usr/share/sendtoiliad, /usr/local/share/sendtoiliad, ~/.sendtoiliad.modules. 
3. Optionally place 'settings' in ~/.sendtoiliad, and edit to your liking. Note that 'True' and 'False' are case-sensitive.
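
As a rough illustration of how a module placed in one of the directories from step 2 might be located, here is a minimal sketch. The function name find_module, the search order, and the rule of matching the lower-cased file extension to a module file name are assumptions for illustration, not the script's actual code.

```python
import os

# Directories as listed in step 2; the search order is an assumption.
MODULE_DIRS = [
    "/usr/share/sendtoiliad",
    "/usr/local/share/sendtoiliad",
    "~/.sendtoiliad.modules",
]


def find_module(filename, module_dirs=MODULE_DIRS):
    """Return the path of the module matching filename's extension, or None."""
    # "myebook.PDF" -> "pdf"
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    if not ext:
        return None
    for directory in module_dirs:
        # expanduser turns a leading "~" into the user's home directory.
        candidate = os.path.join(os.path.expanduser(directory), ext + ".py")
        if os.path.isfile(candidate):
            return candidate
    return None
```

With pdf.py installed in one of these directories, find_module("myebook.pdf") would return the full path to that module, and None for a format with no module.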

Usage:
sendtoiliad.py is a command-line tool. Run sendtoiliad.py --help to see help on all options. Basic usage is as follows:

sendtoiliad.py myebook.pdf
Will create a folder "myebook" in the current directory, with the file, manifest.xml, and cover image, leaving the original file in place.

sendtoiliad.py myebook.pdf -o ~/Iliad/outbox/MMC/books
Will create the folder in your ~/Iliad/outbox/MMC/books, leaving the original file in place.

sendtoiliad.py myebook.pdf -o ~/Iliad/outbox/MMC/books -a ~/Library/New --move
Will create the folder in the outbox as above, place a copy of the original file in ~/Library/New, and remove the original file.

Additional options are available. Please see the help text for a complete list. Also note defaults for all options can be specified in ~/.sendtoiliad/settings. See the included settings file for examples.

Modules
Support for new formats can be added by writing additional modules. A module is a Python module named after the file extension of the supported format (e.g. pdf.py is for .pdf files). This module must define a function, named init(), which accepts a generic file_info object as an argument and returns a subclass of it.

This object may define any or all of the following methods and properties:

Methods:
  fullscreen(self, manifest): Accepts a minidom object containing the manifest.xml. This should append whatever is needed to instruct the viewer to open in fullscreen mode, and return the manifest object.
  crop(self, file): Accepts the path to the output file, and does whatever is needed to crop margins to the level specified in the settings file.

Properties:
  file: Text string giving the path of the input file. This is set by the parent class and generally should not be altered.
  title: Text string giving the title of the book (displayed on first line in contentlister).
  author: Text string giving the author of the book (displayed on the second line).
  date: Not currently used.
  pages: Integer giving the number of pages (also displayed on the second line).

All of these methods and properties are optional. If you do not define them, a reasonable default will be inherited from the parent class. None is also an acceptable value for all properties except file.
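
The following is a minimal sketch of what a module could look like, here a hypothetical txt.py for .txt files. The FileInfo class is only a stand-in for the generic file_info parent so that the example runs on its own; its name, constructor, and defaults are assumptions, as is the reading that init() receives the parent class and returns a subclass of it.

```python
import os


class FileInfo(object):
    """Stand-in for the generic parent class (assumed, not the real one)."""
    # Every property except file may be None.
    title = None
    author = None
    date = None
    pages = None

    def __init__(self, file):
        self.file = file  # path of the input file

    def fullscreen(self, manifest):
        return manifest  # default: return the manifest unchanged

    def crop(self, file):
        pass  # default: no cropping


def init(file_info):
    """Accept the generic class and return a format-specific subclass."""

    class TxtInfo(file_info):
        @property
        def title(self):
            # Use the file name without its extension as the title.
            base = os.path.basename(self.file)
            return os.path.splitext(base)[0]

    return TxtInfo
```

In this sketch only title is overridden; author, date, pages, fullscreen() and crop() are inherited from the parent. For example, init(FileInfo)("/books/myebook.txt").title gives "myebook".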

Format notes:

PDF:

PDF metadata is often omitted or filled with garbage. In these cases, this script will grab the first two lines of text from the first two pages as the title and author, respectively. This works well for most professionally typeset books (e.g. TOR ebooks), and will generally at least give something more useful than the filename. I do intend to improve these heuristics, but there are limits on what can be done.
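
The fallback described above can be sketched roughly as follows. This is an illustration only: the function name guess_title_author is hypothetical, the page text is assumed to have been extracted already (for example with pyPdf), and skipping blank lines is an assumption about how "lines of text" are counted.

```python
def guess_title_author(page_texts):
    """Return (title, author) from the first two non-blank lines of text.

    page_texts is a list of strings, one per page; only the first two
    pages are considered.
    """
    lines = []
    for text in page_texts[:2]:
        for line in text.splitlines():
            line = line.strip()
            if line:  # ignore blank lines
                lines.append(line)
    title = lines[0] if len(lines) > 0 else None
    author = lines[1] if len(lines) > 1 else None
    return title, author
```

If fewer than two lines of text are found, the missing values are returned as None.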

Epub:

Page counts are currently not calculated. A page is a rather fuzzy concept in reflowable formats, making page counts a far more complex subject than would be expected. I do intend to implement this in a fashion compatible with FBReader, but this will take some experimentation.

Cover art is not thumbnailed for all ebooks. This is because the ePub format does not appear to define a standard for cover art. This script will recognise files created using ConvertLit+OEBtoePub, and find the correct artwork for this case. For other books, the first image referenced in the OPF manifest will be assumed to be the cover. Some ebooks (such as those from feedbooks.com) will not have an image in this list, and will therefore not be thumbnailed.
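
The fallback rule for other books (first image referenced in the OPF manifest) might look something like the sketch below. The function name first_manifest_image is hypothetical, and the code assumes the manifest entries are unprefixed item elements carrying href and media-type attributes; it does not cover the ConvertLit+OEBtoePub case.

```python
from xml.dom import minidom


def first_manifest_image(opf_xml):
    """Return the href of the first image item in an OPF manifest, or None."""
    doc = minidom.parseString(opf_xml)
    for item in doc.getElementsByTagName("item"):
        # Image entries have a media-type such as image/jpeg or image/png.
        if item.getAttribute("media-type").startswith("image/"):
            return item.getAttribute("href")
    return None  # no image listed, so no thumbnail can be made
```

A return value of None corresponds to the case where the manifest lists no image and the book is not thumbnailed.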
