	Calibre2Web README
	~~~~~~~~~~~~~~~~~~

Author
~~~~~~
	Dave Walker (itimpi)
	Email:  Calibre2Web@itimpi.freeserve.co.uk

Copyright
~~~~~~~~~
	Dave Walker, 2008/2009


Description
~~~~~~~~~~~

This is a program that is designed to work in conjunction with the Open
Source Calibre Library Management system, and with ePub format books for
the iPhone/iTouch as used by the Stanza ebook reader.

I developed it for my own personal use, and it is currently in use by myself
and some friends who also have iPhone or iTouch devices.   It occurred to 
me that it might be of use to a wider audience.

The primary purpose of Calibre2Web is to produce a set of hierarchical
catalogs so that your private book collection can be presented in a
structured way similar to the online catalogs that come pre-supplied with
Stanza for the iPhone or iTouch.

An important secondary consideration is to provide a simple-to-use method
of getting one's private book collection onto the iPhone/iTouch without
the need to use Stanza Desktop.   It will also allow those who do not have
wireless networking to get books into Stanza on the iPhone/iTouch far more
easily.

Downloading a whole library to Stanza is very tedious. One might do it to
gain the benefit of clicking 'subject' and getting an organised list.
With Calibre2Web, your online catalog is already organised by this and other
criteria. This means that you do not need to download every book to your
iPhone/iTouch any more, but can keep the catalog online, download one book
at a time for reading, and delete it when finished.

Calibre2Web creates a static Catalog structure along the lines shown below 
(each level of indent indicating a level of indirection)

Catalog -> Recently Downloaded -> EPub Books

           Authors -> Authors:A  -> Author#    -> Titles:All -> EPub books
                      ...           Seriesx:   -> Series#    -> EPub books          
                      Authors:Z                          (Same files as under Series branch)
                      

           Series ->  Series:A  -> Series# -> EPub books
                      ...
                      Series:Z

           Titles ->  Titles:A  -> EPub Books
                      ...         ...
                      Titles:Z

           Categories-> CategoryName-> Category:A ->EPub books
                                        ...
                                       Category:Z
							...
 Special Cases are:
 - Books tagged as PAPER have [P] appended to title
 - Books tagged as WISHLIST or WANTED have [W] appended to title
 - Books tagged as OMNIBUS have (Omnibus) appended to title
 - If no cover file then link for cover omitted.
 - If no EPUB format file then link for download omitted
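
The title rules above can be sketched in a few lines.  This is an
illustrative Python sketch, not the VBScript that Calibre2Web actually
runs.  The function name decorate_title, the order of the markers and the
case-insensitive tag matching are all assumptions made for the example.

```python
def decorate_title(title, tags):
    """Return the title with the special-case markers described above.

    Hypothetical helper: the name, the marker order and the
    case-insensitive tag comparison are assumptions for illustration.
    """
    wanted = {tag.strip().upper() for tag in tags}
    if "PAPER" in wanted:
        title += " [P]"
    if "WISHLIST" in wanted or "WANTED" in wanted:
        title += " [W]"
    if "OMNIBUS" in wanted:
        title += " (Omnibus)"
    return title
```

A book with no matching tags keeps its title unchanged.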

The Categories correspond to the Tags at the calibre level.
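
The by-letter levels in the structure above (Authors:A to Authors:Z and so
on) amount to bucketing names by their first letter.  A minimal Python
sketch of that idea follows.  The helper name group_by_letter is
hypothetical, and filing names that do not start with A-Z under "#" is an
assumption rather than documented Calibre2Web behaviour.

```python
from collections import defaultdict

def group_by_letter(names):
    """Group names under their upper-cased first character.

    Hypothetical helper; names that do not start with A-Z are filed
    under "#", which is an assumption rather than documented behaviour.
    """
    buckets = defaultdict(list)
    for name in sorted(names, key=str.lower):
        first = name[:1].upper()
        key = first if "A" <= first <= "Z" else "#"
        buckets[key].append(name)
    return dict(buckets)
```

Each key of the returned dictionary would correspond to one sub-catalog.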


Calibre
~~~~~~~

Calibre is an Open Source library management system for use with eBooks.
It provides powerful facilities for managing books and for converting 
between various eBook formats so that it is usable with a wide range of
eBook type devices.

In particular it supports the Sony Reader family with conversions to LRF.
Support has been added for converting to the EPUB format as used by the 
Stanza eBook reader on the iPhone/iTouch. Calibre2Web is targeted at this
type of user.

As I personally have both these devices the ability to handle them both through
a single piece of software has proved extremely attractive.

In addition, as it is Open Source, the Calibre product tends to develop
very quickly and its functionality is continually being enhanced.  Conversions to
other formats such as LIT (Microsoft Reader) and MOBI (various devices
including the Amazon Kindle) are under development.


Restrictions
~~~~~~~~~~~~

1. The Calibre2Web program is written in VBScript.  This means that it
   will only run on Windows-based systems.   It will not run on Linux
   or MacOS as these systems do not support VBScript.  I developed this
   script using the excellent vbsedit tool, but this is not required to
   run this script.

   This restriction should disappear when the Calibre2Web functionality is
   rewritten in a more portable programming language.   This will be required
   if it is to be incorporated as a standard component in future Calibre releases.

2. The Calibre2Web program makes many assumptions about the format of the
   metadata.db file used by Calibre.   If the format of this file changes
   then Calibre2Web will almost certainly stop working correctly until it
   can be updated to handle the new format.
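
One way to notice a format change early is to check that the expected
tables are present before doing anything else.  The following is a hedged
Python sketch of such a check, not something Calibre2Web itself does.  The
function name missing_tables is hypothetical, and the list of table names
is an assumption that should be adjusted to match your Calibre version.

```python
import sqlite3

# Tables this sketch expects to find. The list is an assumption for
# illustration; adjust it to whatever your Calibre version provides.
EXPECTED_TABLES = {"books", "authors", "series", "tags", "data"}

def missing_tables(connection, expected=EXPECTED_TABLES):
    """Return the expected table names that the database lacks."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )
    present = {row[0] for row in rows}
    return sorted(expected - present)
```

Called with a connection opened on metadata.db, an empty list means every
expected table was found.  A non-empty list names the tables to look at.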


Known Bugs
~~~~~~~~~~
 - Count of Books for Series is wrong if any of the books have multiple authors
   although the correct list of books is shown if you open the series.  Fixing
   this would require major changes to the code so I have not bothered.


Installation & Running
~~~~~~~~~~~~~~~~~~~~~~

1. Copy the files supplied into your preferred location.  Exactly where is
   not critical, but a good choice is the folder in which your Calibre library
   is located.   This is the folder that contains the Calibre metadata.db file.

2. Download and install a sqlite3 ODBC driver if you do not already have one.
   I used the one from http://www.ch-werner.de/sqliteodbc/

3. Configure an ODBC connection under Control Panel->Administrative Tools->Data Sources (ODBC)
   (you normally need to have administrator rights to run this facility).
        Click on the System DSN tab
        Click on the Add button to create a new entry
        Select the SQLite3 ODBC Driver from the list displayed and press Finish
	This will then pop up a dialog for configuring the new DSN connection.  Set:
	  Data Source Name = CALIBRE
	  Database Name = U:\eBOOKS\Calibre Library\metadata.db  (your path may be different)

4. Run the program by simply doing an "open" of the Calibre2Web.bat file that
   is supplied.   This contains a 1 line command of:
	cScript Calibre2Web.vbs
   and is provided so that you can run the program without the need to first
   open up a command prompt window.

   Whichever way you start it a similar looking command window is opened.
   While the program is running it will give progress messages.   You may not
   have time to see these if you are not running from a command window.

5. When the program is run it will generate a '_CATALOGS' sub-folder at the location
   where Calibre2Web is run from.  This folder contains a set of XML files which are
   catalog files in the "Atom" Syndication Format (RFC 4287) as used by Stanza.
   These will be used later on the web server.

   NOTE: All files in the '_CATALOGS' folder are deleted and then re-created every
         time Calibre2Web is run, so do not use this folder to store any files
         that you wish to keep.
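
To give a feel for what such a catalog file contains, here is a minimal
Python sketch that builds an Atom feed with one entry per book.  It is not
the output of Calibre2Web itself: the function name build_feed is
hypothetical, and a real RFC 4287 feed needs further elements (such as id
and updated) that are left out here for brevity.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

def build_feed(title, books):
    """Build a minimal Atom feed; books is a list of (title, href) pairs.

    Illustrative only: real catalogs carry more elements (ids, dates,
    authors, cover links) than are shown here.
    """
    ET.register_namespace("", ATOM_NS)
    feed = ET.Element("{%s}feed" % ATOM_NS)
    ET.SubElement(feed, "{%s}title" % ATOM_NS).text = title
    for book_title, href in books:
        entry = ET.SubElement(feed, "{%s}entry" % ATOM_NS)
        ET.SubElement(entry, "{%s}title" % ATOM_NS).text = book_title
        ET.SubElement(entry, "{%s}link" % ATOM_NS, {
            "href": href,
            "type": "application/epub+zip",
        })
    return ET.tostring(feed, encoding="unicode")
```

Building the document through an XML library rather than by string
concatenation means characters such as '&' in titles are escaped
automatically.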
        

If you are worried about what the Calibre2Web program is doing, then
you can open the files in a text editor and view the source directly.


Preparation in Calibre
~~~~~~~~~~~~~~~~~~~~~~

Calibre2Web will only generate entries for books that are in the Calibre
metadata database.  Download links will only be available for those
books for which there is an ePub version.

Therefore it is recommended that the actions you take in Calibre to get
ready for using Calibre2Web are:

1. Load the Books that are of interest to you into the Calibre
   Library Management system

2. Use Calibre to set up the book metadata and cover art the way that
   you want it.  Calibre gives simple facilities for downloading
   book metadata and cover art from the Internet if they are not
   already present in the book when you load it into Calibre

3. Convert your books into ePub format within Calibre.  Stanza
   on the iPhone/iTouch can only use ePub format books.

4. When you are ready, run the Calibre2Web program.   This will
   generate a set of catalog files (in XML format) within the
   _CATALOGS sub-folder of the Calibre2Web program location.

5. If you add new books to your Calibre library or edit the metadata
   or Cover Art then you will need to repeat the process of running
   Calibre2Web to generate the updated catalogs.


Preparing Your Web Site
~~~~~~~~~~~~~~~~~~~~~~~

The files generated by Calibre2Web are intended to be hosted on a web
server.   In most cases this will be a private Web server although there
is nothing that mandates this. 

There is no specific location mandated for the files.   The catalog files
are generated to use relative URLs, so as long as you maintain the folder
structure generated by Calibre and Calibre2Web then the exact location
does not matter. 
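
Folder and file names in a library often contain spaces, brackets or
non-ASCII characters, which need escaping when they appear in a URL.  A
minimal Python sketch of building such a relative URL is shown below.  The
helper name relative_url and the example folder names are hypothetical.

```python
from urllib.parse import quote

def relative_url(*path_parts):
    """Join path parts into a percent-escaped relative URL.

    Hypothetical helper: each part is escaped separately so that "/"
    is only ever used as the separator.
    """
    return "/".join(quote(part, safe="") for part in path_parts)
```

For example, relative_url("..", "Jane Austen", "Emma (12)", "cover.jpg")
gives ../Jane%20Austen/Emma%20%2812%29/cover.jpg.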

Basically you need to have on your web server:

-  The Calibre ePub files and cover art files.    You must maintain the folder
   structure and file names that have been generated by Calibre within the
   Calibre library.   You could simply copy all the files in the Calibre
   library folder, but that may well be more than you need if you have
   eBook files present for formats other than ePub.

-  The _CATALOGS folder and XML files that are generated by Calibre2Web.

You can use whatever software you normally use to maintain your web site
to copy the files across.   Typically this would be some sort of FTP
application.

If you are hosting it on a private web server, then a good approach is to
make the location where you store your Calibre library visible directly
to the web server.   As an example in my case I have my Calibre library
stored on a NAS networked file server.  I keep the _CATALOGS folder and
generated XML files in the Calibre Library folder.  On that same server 
I am running a web server which can see the Calibre files.  Therefore
once I have run the Calibre2Web command to generate the catalog files 
they immediately become visible to Stanza on my iPhone.  This avoids
having to copy any files across to the web server.


Adding the Catalog to the iPhone/iTouch
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To add your newly generated set of catalogs to Stanza on your iPhone/iTouch
you need to do the following:

1. Load Stanza on the iPhone/iTouch

2. Select the Online Catalogs option

3. Press the "+" button to take you into the dialog for adding a new catalog.
   Then select the "Add Stanza Catalog" option (not the Add Web Page option).

4. Give your new catalog any name that you like.
   The URL entry must be the path to the catalog.xml file within the
   catalogs folder generated by Calibre2Web.   This is the top
   level of your new private catalog
   	e.g. http://mywebserver/stanza/_CATALOGS/catalog.xml
   although the exact location under your web server folder hierarchy might
   be different.

5. Confirm the settings and your new catalog will appear on the list
   of online catalogs

6. Select this catalog and your books should appear.   If they
   do not, check that you have your web server running and that you
   have the correct URL defined for your online catalog.


Web Server Security
~~~~~~~~~~~~~~~~~~~

If you are hosting the new catalog on your private web server and you want
to be able to access it away from home, then you need to look at how to
handle port forwarding from the public Internet to your private web server.
This normally involves setting up NAT (Network Address Translation) entries
on your router.  The exact details vary according to the type
of system and network you have, but if you can make it work for normal
web pages then it will also work with Stanza.

You should be careful about hosting your private eBook library on a public 
web server as you may inadvertently end up breaking copyright regulations.
Also if you are hosting it on a private web server but setting up access to
this private web server from the public Internet you should have some sort
of security in place for the same reason.  Normally this would involve
configuring your web server so that any attempt to access the folder
holding your library results in a Username/Password prompt.


Troubleshooting
~~~~~~~~~~~~~~~

This section is intended to give guidance on some common errors.

1. If a book appears in Calibre with an ePub format, but does not appear in
   the list of titles in Stanza, then it is possible that you started to
   convert the book to ePub format, but the conversion did not complete
   successfully.  Rerunning the conversion to ePub should fix this problem.

2. If a particular book has cover art when viewed from Calibre, but not when
   shown in the Stanza catalog, then rerun the conversion to ePub.

3. Depending on the web server you use, filenames may be case sensitive.
   Calibre2Web uses the case stored in the Calibre metadata.db file, so
   make sure you do not inadvertently change the case when copying files
   onto your web server.

4. If you get database related errors when running Calibre2Web, then make
   sure that Calibre is not also running as it may be locking the database.
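
For the case problem described in item 3 above, a small check run against
a copy of the library can help.  The following Python sketch is an
illustration only.  The function name find_case_mismatches is hypothetical,
and it assumes the copied files are reachable as a local or mounted folder.

```python
import os

def find_case_mismatches(root, relative_paths):
    """Return the paths whose exact spelling is not found under root.

    Hypothetical helper: each path component is compared against the
    directory listing, so a case difference is reported even on a
    file system that ignores case.
    """
    mismatches = []
    for relative_path in relative_paths:
        current = root
        for part in relative_path.split("/"):
            if not os.path.isdir(current) or part not in os.listdir(current):
                mismatches.append(relative_path)
                break
            current = os.path.join(current, part)
    return mismatches
```

Any path returned is either missing or spelt with different case from the
one expected.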


IDEAS/TODO
~~~~~~~~~~

These are some of the ideas I have for improvement.   There is no
commitment to providing them.  How far I go will depend on both my
personal needs and also requests from users of the current version
of the software.

1. MOST IMPORTANT
   Rewrite the logic into a more portable language (e.g. Python or Perl)
   so that it can be run on all the platforms currently supported
   by Calibre.

2. See if the logic can be transposed into Python syntax so that it can
   be submitted for inclusion in the standard Calibre releases at some
   future date.  It is still not clear whether it is better to go
   down this route or keep Calibre2Web as a free-standing capability.

3. The Categories functionality is driven by Calibre tags.
   Might need a way to determine which tags identify genres, and
   which ones do not, although initially all tags would be treated
   equally.

4. See if there is any way to generalise the catalog generation
   process so that users can give their own personal structure.
   This may not be that easy!

5. Change the logic around so that catalogs can be generated dynamically
   on demand rather than statically as in the current version.  The
   option to generate static catalogs would be maintained for those
   who need it.

6. Implement Search capability.   This would be reliant on the move to
   a dynamic mode of catalog generation as static catalogs do not lend
   themselves to search.

7. Support clients that require something other than the Atom format
   used by Stanza for catalogs or ePub for the eBooks.  This might
   require delivering the catalogs in another format (e.g. HTML) or
   eBooks that are in formats other than ePub.  It might also be
   possible to use the current XML files in conjunction with stylesheets
   to transform them to HTML.


Useful Links
~~~~~~~~~~~~

   Calibre2Web Home Page
	http://homepage.ntlworld.com/itimpi/ebooks.htm

   Stanza Home Page
	http://www.lexcycle.com/

   Stanza Support Forums
	http://www.lexcycle.com/forum/stanza

   Calibre Home Page
	http://calibre.kovidgoyal.net/

   Calibre Support Forums
	http://www.mobileread.com/forums/forumdisplay.php?f=166


Change History
~~~~~~~~~~~~~~

 22 Nov 2008    itimpi  0.1     First version

 01 Dec 2008    itimpi  0.2     Added: Catalog by letter
                                Added: Counts within catalogs
                                Added: Size details at Book level
                                Added: If any set, Tag details at Book level
                                Added: If any set, Series Name and Index at Book level
                                Added: Individual author level now shows all; series; non-series
                                Fixed: Empty catalog error messages when no EPUB format ebook

 07 Dec 2008    itimpi  0.3     Change: Name changed to Calibre2Web to more accurately reflect
                                        functionality and likely future development direction

 13 Dec 2008    itimpi  0.4     Change: All Books in database are now included in listings
                                        even if there is no EPUB format present
                                Change: Books with no EPUB format have no download link
                                Change: Books with no cover.jpg file have no cover link
                                Fix:    Format of .EPUB filename in Calibre library for books
                                        with multiple authors
                                Added:  Books with no EPUB format and PAPER tag have "[P]"
                                        appended to their title.
                                Added:  Books with "WISHLIST" or "WANTED" in tags now have "[W]"
                                        appended to their title
                                Added:  Books tagged as Omnibus now have "(Omnibus)" appended to
                                        their title, and have no series ID shown
                                Added:  Support for Categories (Tags)
                                Change: Book Size no longer shown under book info

 20 Dec 2008    itimpi  0.5     Added:  Large Catalogs (> MAX_CATALOG books) are now divided
                                        an extra level by the start letter
                                Fix:    Escape special characters in title
                                Fix:    Escape special characters in Series name
                                Change: Generation of the "All" option under an author is
                                        suppressed if there are no books in a series
                                Change: The 'All' option is only generated under an Author.  No longer
                                        done at the Authors, Series, Titles, Categories levels.

 11 Jan 2009    itimpi  0.6     Added:  Recent Additions option at top level
                                Added:  Pagination within Recent Additions list to improve loading
                                        time when not using WiFi, and to prove concept works OK.

 24 Jan 2009    itimpi  0.7     Added:  Counts to sub-text on top level screen
 
 16 Mar 2009    itimpi  0.8     Fix:    Fixed misspelt variable name which would cause runtime error.
 
 04 Apr 2009    itimpi  0.9     Fix:    Fixed problem with large categories which were not generating
                                        the sub-catalogs by letter correctly
                                        
 23 May 2009    itimpi  0.91    Fix:    When URLs contained non-ASCII characters they needed to be escaped
                                Added:  Total run time now displayed at end of run.
 
 06 Jun 2009    itimpi  0.92    Fix     Special characters in tags were not being escaped
 
 07 Jun 2009    itimpi  0.93    Fix     Character '&' in Title was not being correctly escaped

 07 Jun 2009    itimpi  0.94    Fix     Character '&' in Categories was not being correctly escaped
 
 14 Jul 2009    itimpi  0.95    Fix     Title was not displaying author name for the "All Books" option
                                        under an author when displaying an Authors books.

 12 Aug 2009    itimpi  0.96    Fix     Special characters in Author Names were not escaped when listing their books.

 15 Nov 2009    itimpi  0.97    Fix     If a title contained "(" character the .epub file was not listed.
 
 28 Feb 2010    itimpi  0.98    Fix:    Unicode characters in XML were not being escaped correctly
                                Added:  Can specify that catalog folder is not to be cleared at start of run.
                                        This reduces run time at cost of possibly leaving superfluous files behind.
                                Fix:    Some paths were relative that should be absolute.