Quote:
Originally Posted by minstrel
My overall plan is to write a Calibre plugin that takes the HTML input on importing and does some very basic, regular-expression-based substitution on the file so some browser-specific tags can be kicked out, and possibly some CSS constructs not supported by EPUB may be replaced. I have seen similar ideas floating around on this forum, but nobody (yet) seems to have tried to address it in the form of a Calibre plugin.
|
That's a good idea. If you come up with a reasonably general plugin, I'll be happy to distribute it with calibre.
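For anyone following along, the regular-expression cleanup described in the quote could look roughly like the sketch below. It is plain Python with no calibre dependency. The rule list `RULES`, the function name `clean_html`, and the particular tags and CSS prefixes are illustrative assumptions, not something taken from the original post or from calibre itself.

```python
import re

# Hypothetical substitution rules: (compiled pattern, replacement).
# These are examples only; a real plugin would need a list tuned to its input.
RULES = [
    # Drop the browser-specific <blink> and <marquee> tags but keep their content.
    (re.compile(r'</?(?:blink|marquee)\b[^>]*>', re.IGNORECASE), ''),
    # Remove vendor-prefixed CSS declarations such as "-moz-border-radius: 3px;".
    (re.compile(r'-(?:moz|webkit|ms|o)-[\w-]+\s*:[^;"}]*;?\s*', re.IGNORECASE), ''),
]


def clean_html(html):
    """Apply every rule in RULES, in order, to a string of HTML."""
    for pattern, replacement in RULES:
        html = pattern.sub(replacement, html)
    return html
```

Regular expressions are a crude tool for HTML, so rules like these are best kept narrow. Keeping them in one list makes it easy to add or drop a rule without touching the rest of the plugin.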
Quote:
My results (about three hours of playing around with writing plugins) have been mixed. Here are my findings so far (maybe the first three should go into the documentation):
1) The plugin name must end with _plugin.py, otherwise Calibre won't find it in the zip container.
|
Yeah, the documentation does specify that this is the case.
Quote:
2) When writing plugins on a windows machine I had to be sure to save the .py file in UNIX format. Windows/DOS format will not work, when adding such a plugin to Calibre it will choke right on line 1 with a weird error.
|
Hmm, calibre just executes the plugin as a normal Python module using the Python interpreter, so this shouldn't be a problem, but I'll implement a fix for it in the next release.
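One way a loader could sidestep the line-ending problem is to normalize the source before compiling it. The sketch below is only an illustration of that idea, not calibre's actual fix. The names `normalize_newlines` and `load_plugin_source` are hypothetical.

```python
def normalize_newlines(source):
    """Convert Windows (CRLF) and old Mac (CR) line endings to Unix (LF)."""
    # Replace CRLF first so a CRLF pair does not turn into two newlines.
    return source.replace(b'\r\n', b'\n').replace(b'\r', b'\n')


def load_plugin_source(source):
    """Compile and execute plugin source given as bytes; return its namespace."""
    code = compile(normalize_newlines(source), '<plugin>', 'exec')
    namespace = {}
    exec(code, namespace)
    return namespace
```

With this in place, a file saved in Windows/DOS format, for example `b"x = 1\r\ny = x + 1\r\n"`, is compiled the same way as its UNIX-format equivalent.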
Quote:
3) The Hello World example differs between the example given on the web page and the one available as a file download (from the same page):
|
Oops, now fixed.
Quote:
5) Why does the example use a FileTypePlugin when it actually modifies the Metadata? Shouldn't it rather be a Metadata Plugin?
|
Creating a metadata plugin would mean that the publisher is overridden in *all* contexts, whereas with a filetype plugin you can ensure that it is only overridden on import, on preprocess, or at whatever other stage you choose.
Quote:
6) I tried to write a simple plugin which simply replaces the word "Hello" by "World". I can nicely install it in Calibre, alas, it won't do anything. Is my approach completely wrong (do I have to do some XML tree processing, do I have to do something special to get to the content, is the temporary_file() method used correctly?):
|
You shouldn't have to, but the case of HTML import is a little complicated because the HTML file is transformed into a ZIP file on import by another plugin, so perhaps the two plugins are interacting to cause this problem. I'll look into it in a little bit.
As a general note, you should probably change your plugin to run on preprocess rather than on import, since people don't always use the calibre GUI to do conversions.
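A minimal sketch of what that change might look like for the "Hello" to "World" test plugin follows. The class name `HelloToWorld` and the metadata values are made up for illustration. The `ImportError` fallback is only a stand-in so the sketch can be tried outside calibre; it is not the real base class. Note that this does not address the HTML-to-ZIP interaction mentioned above.

```python
import tempfile

try:
    from calibre.customize import FileTypePlugin
except ImportError:
    # Stand-in so the sketch runs outside calibre; NOT the real base class.
    class FileTypePlugin(object):
        def temporary_file(self, suffix):
            return tempfile.NamedTemporaryFile(suffix=suffix, delete=False)


class HelloToWorld(FileTypePlugin):
    # Hypothetical metadata, for illustration only.
    name = 'Hello To World'
    description = 'Replace "Hello" with "World" in HTML input'
    supported_platforms = ['windows', 'osx', 'linux']
    author = 'Example Author'
    version = (1, 0, 0)
    file_types = set(['html', 'htm'])

    on_import = False      # do not run when a file is added to the library
    on_preprocess = True   # run just before conversion instead

    def run(self, path_to_ebook):
        # Read the input as bytes to avoid guessing its encoding.
        with open(path_to_ebook, 'rb') as f:
            data = f.read()
        # Write the modified copy to a temporary file and return its path.
        out = self.temporary_file('.html')
        out.write(data.replace(b'Hello', b'World'))
        out.close()
        return out.name
```

Returning the path of a new temporary file, rather than rewriting the input in place, leaves the original file untouched if the substitution goes wrong.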