MobileRead Forums > E-Book Software > Calibre > Plugins

Old 04-12-2009, 06:19 AM   #1
minstrel
Junior Member
minstrel began at the beginning.
Posts: 3
Karma: 10
Join Date: Apr 2009
Device: Sony PRS-505
Calibre plugin development - Newbie problems

Although I'm an old hand at providing ebooks to PG via Distributed Proofreaders, I only recently got into converting and reading them on an actual ebook reader (Sony PRS-505). Naturally, I want the books to look as good as they can on the reader, so I mused a bit about doing some additional processing when feeding the HTML files through Calibre.

My overall plan is to write a Calibre plugin that takes the HTML input on import and does some very basic, regular-expression-based substitution on the file, so that some browser-specific tags can be kicked out and some CSS constructs not supported by epub can be replaced. I have seen similar ideas floating around on this forum, but nobody (yet) seems to have tried to address it in the form of a Calibre plugin.

My results (about three hours of playing around with writing plugins) have been mixed. Here are my findings so far (maybe the first three should go into the documentation):

1) The plugin name must end with _plugin.py, otherwise Calibre won't find it in the zip container.

2) When writing plugins on a Windows machine, I had to be sure to save the .py file in UNIX format. Windows/DOS format will not work: when such a plugin is added, Calibre chokes right on line 1 with a weird error.

3) The Hello World example differs between the example given on the web page and the one available as a file download (from the same page):

example page: set_metadata(file, mi, ext)
downloadable plugin: set_metadata(file, ext, mi)

Note the differing argument order (my Python knowledge is still limited; I know that arguments can be passed in any order when they are named, but I don't think that applies here, since these are passed by position).

4) I managed to install the HelloWorld plugin in Calibre, but it didn't seem to do anything for me. (I tried both versions; Calibre runs under Vista with the German localization.)

5) Why does the example use a FileTypePlugin when it actually modifies the Metadata? Shouldn't it rather be a Metadata Plugin?

6) I tried to write a simple plugin which simply replaces the word "Hello" with "World". I can install it in Calibre just fine, but alas, it won't do anything. Is my approach completely wrong (do I have to do some XML tree processing, do I have to do something special to get to the content, is the temporary_file() method used correctly?):

Code:
import os, re
from calibre.customize import FileTypePlugin

class CleanupLitPlugin(FileTypePlugin):

  name                = 'Regular Expression plugin' # Name of the plugin
  description         = 'Apply Regular Expression to input'
  supported_platforms = ['windows', 'osx', 'linux'] # Platforms this plugin will run on
  author              = 'Markus Brenner' # The author of this plugin
  version             = (1, 0, 0)   # The version number of this plugin
  file_types          = set(['html']) # The file types that this plugin will be applied to
  on_import           = True
  
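  def run(self, path_to_ebook):
    file = open(path_to_ebook, 'r+b')
    outfile = self.temporary_file("mab")  # temporary_file() is a method of the plugin, so it needs self.
    for line in file:
      output = re.sub(r'Hello',r'World',line)
      outfile.write(output)
    outfile.close()  # flush the substituted text to disk before returning its path
    return outfile.name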
7) Is there a way to debug Calibre plugins, like writing some debugging text to a console when the run() method is called?

Any hints what I did wrong would be very welcome! (And I have a feeling other people would benefit, too).

Thanks,
-markus
Old 04-12-2009, 10:34 AM   #2
minstrel
Quote:
Originally Posted by minstrel
7) Is there a way to debug Calibre plugins, like writing some debugging text to a console when the run() method is called?
A minor update: After reading some more in an older thread, I figured out that I can launch "calibredb add [myfile.html]" from the command line, which gives me print debug output from my Python plugin.

That let me verify that my plugin gets called from the importer and that the regular expression works. I even get to see the temporary file name. However, even though I return the tempfile's name, I still get the original text in my ebook, not the substituted one.

What am I doing wrong?

Code:
import os, re
from calibre.customize import FileTypePlugin

class CleanupLitPlugin(FileTypePlugin):

  name                = 'Regular Expression plugin' # Name of the plugin
  description         = 'Apply Regular Expression to input'
  supported_platforms = ['windows', 'osx', 'linux'] # Platforms this plugin will run on
  author              = 'Markus Brenner' # The author of this plugin
  version             = (1, 0, 0)   # The version number of this plugin
  file_types          = set(['html', 'epub']) # The file types that this plugin will be applied to
  on_import           = True
  
  def run(self, path_to_ebook):
    file = open(path_to_ebook, 'r+b')
    outfile = self.temporary_file(".html")
    for line in file:
      output = re.sub(r'Hello',r'World',line)
      print output
      outfile.write(output)
    outfile.close()
    print outfile.name
    return outfile.name
Output:

Code:
C:\Users\mab\Documents\Calibre-test\HMTL>calibredb add test.html
Loading plugin from C:\Users\mab\AppData\Roaming\calibre\plugins\Regular Expression plugin.zip
<html>

<body>

<p>World, good morning.</p>

</body>

</html>
c:\users\mab\appdata\local\temp\calibre_0.5.6_9qee2x.html
Building file list...
        Parsing Calibre-test\HMTL\test.html
Open ebook created in c:\users\mab\appdata\local\temp\calibre_0.5.6_13qosd_create_oebzip
Output saved to c:\users\mab\appdata\local\temp\calibre_0.5.6_svedje_plugin_html2zip.zip
Old 04-12-2009, 10:52 AM   #3
user_none
Sigil & calibre developer
user_none ought to be getting tired of karma fortunes by now.
Posts: 2,488
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
Quote:
Originally Posted by minstrel
Output:
Code:
<p>World, good morning.</p>
You were trying to replace Hello with World, so isn't that output correct?
Old 04-12-2009, 11:58 AM   #4
minstrel
Quote:
Originally Posted by user_none
You were trying to replace Hello with World, so isn't that output correct?
Well, that was the debug output on the console. When I view the actual book in Calibre's library, the original string is still in there.

Bottom line: My code gets called and the regular expression substitution works, but the file with the replaced text (the temp file) doesn't seem to make it back into the library for further conversion.
Old 04-12-2009, 12:30 PM   #5
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.
Posts: 43,842
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Quote:
Originally Posted by minstrel
My overall plan is to write a Calibre plugin that takes the HTML input on import and does some very basic, regular-expression-based substitution on the file, so that some browser-specific tags can be kicked out and some CSS constructs not supported by epub can be replaced. I have seen similar ideas floating around on this forum, but nobody (yet) seems to have tried to address it in the form of a Calibre plugin.
That's a good idea. If you come up with a reasonably general plugin, I'll be happy to distribute it with calibre.

Quote:
My results (about three hours of playing around with writing plugins) have been mixed. Here are my findings so far (maybe the first three should go into the documentation):

1) The plugin name must end with _plugin.py, otherwise Calibre won't find it in the zip container.
Yeah, the documentation does specify that this is the case.
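
For anyone packaging by hand, here is a minimal sketch of building the plugin zip so that the module inside it has a name ending in _plugin.py, as point 1 describes. The helper name build_plugin_zip and the file names are made up for illustration; only Python's standard zipfile module is used, nothing from calibre itself.

```python
import os
import tempfile
import zipfile


def build_plugin_zip(source_path, zip_path, module_name="regex_plugin.py"):
    """Pack one plugin source file into a zip under a *_plugin.py name.

    build_plugin_zip and the default module_name are hypothetical names
    chosen for this sketch; they are not part of calibre.
    """
    if not module_name.endswith("_plugin.py"):
        raise ValueError("module name must end with _plugin.py: %r" % module_name)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # arcname controls the name stored inside the zip container
        zf.write(source_path, arcname=module_name)
    return zip_path


# Usage: pack a throwaway source file and list what ended up in the zip.
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "my_code.py")
with open(source, "w") as f:
    f.write("# plugin code goes here\n")
archive = build_plugin_zip(source, os.path.join(workdir, "Regular Expression plugin.zip"))
with zipfile.ZipFile(archive) as zf:
    names = zf.namelist()
```

The source file on disk can have any name; what matters in this sketch is the name stored inside the container.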

Quote:
2) When writing plugins on a Windows machine, I had to be sure to save the .py file in UNIX format. Windows/DOS format will not work: when such a plugin is added, Calibre chokes right on line 1 with a weird error.

Hmm, calibre just executes the plugin as a normal Python module using the Python interpreter, so this shouldn't be a problem, but I'll implement a fix for it in the next release.
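
In the meantime, if an editor cannot save in UNIX format, the line endings can be normalized before the file is zipped. This is only a sketch of that workaround; the function names to_unix_newlines and convert_file are illustrative and not part of calibre.

```python
def to_unix_newlines(data):
    """Return *data* (bytes) with CRLF and lone CR line endings turned into LF."""
    # Handle CRLF first so that each pair becomes a single LF, not two.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def convert_file(path):
    """Rewrite the file at *path* in place with UNIX line endings."""
    with open(path, "rb") as f:
        data = f.read()
    with open(path, "wb") as f:
        f.write(to_unix_newlines(data))
```

Working on bytes rather than text avoids any guess about the file's encoding.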

Quote:
3) The Hello World example differs between the example given on the web page and the one available as a file download (from the same page):

Oops, now fixed.
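
To illustrate why the order mattered in point 3: Python matches positional arguments strictly by position, and only named (keyword) arguments can be given in any order. The set_metadata below is a hypothetical stand-in that just reports what it received; it is not calibre's function, and its parameter order is chosen for illustration only.

```python
def set_metadata(stream, mi, ext):
    """Hypothetical stand-in: report which value landed in which parameter."""
    return {"stream": stream, "mi": mi, "ext": ext}


# Positional call with the last two arguments swapped: the values end up
# in the wrong parameters, and Python does not complain.
swapped = set_metadata("book.epub", "epub", "<metadata object>")

# Keyword call: the order at the call site no longer matters.
named = set_metadata("book.epub", ext="epub", mi="<metadata object>")
```

Here swapped["mi"] holds the string "epub", while named has every value in the intended slot.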

Quote:
5) Why does the example use a FileTypePlugin when it actually modifies the Metadata? Shouldn't it rather be a Metadata Plugin?
Creating a metadata plugin would mean that the publisher is overridden in *all* contexts, whereas with a filetype plugin you can ensure that it is only overridden on import, on preprocess, or wherever you choose.

Quote:
6) I tried to write a simple plugin which simply replaces the word "Hello" with "World". I can install it in Calibre just fine, but alas, it won't do anything. Is my approach completely wrong (do I have to do some XML tree processing, do I have to do something special to get to the content, is the temporary_file() method used correctly?):
You shouldn't have to, but the case of HTML import is a little complicated because the HTML file is transformed into a ZIP file on import by another plugin, so perhaps the two plugins are interacting to cause this problem. I'll look into it in a little bit.

As a general note, you should probably change your plugin to run on preprocess rather than on import, since people don't always use the calibre GUI to do conversions.
Old 04-12-2009, 12:44 PM   #6
kovidgoyal
Try setting priority = 100 in your plugin class to make sure it runs before the HTML2ZIP plugin.
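
Putting the two suggestions together (run on preprocess, and set a priority of 100), the plugin might look roughly like the sketch below. It is untested against calibre and the thread does not confirm that this resolves the problem. The class name RegexCleanupPlugin and the helper substitute are made-up names, and the fallback FileTypePlugin class is only a stand-in so the sketch can be run outside calibre.

```python
import re
import tempfile

try:
    from calibre.customize import FileTypePlugin
except ImportError:
    # Stand-in for running this sketch outside calibre (an assumption for
    # illustration, not calibre's implementation).
    class FileTypePlugin(object):
        def temporary_file(self, suffix):
            return tempfile.NamedTemporaryFile(suffix=suffix, delete=False)


def substitute(line):
    """Replace every b'Hello' with b'World' in one line of bytes."""
    return re.sub(b"Hello", b"World", line)


class RegexCleanupPlugin(FileTypePlugin):

    name                = 'Regular Expression plugin'
    description         = 'Apply Regular Expression to input'
    supported_platforms = ['windows', 'osx', 'linux']
    author              = 'Markus Brenner'
    version             = (1, 0, 0)
    file_types          = set(['html'])
    on_import           = False  # do not run when a book is added
    on_preprocess       = True   # run before conversion instead
    priority            = 100    # per the advice above: run before HTML2ZIP

    def run(self, path_to_ebook):
        outfile = self.temporary_file('.html')
        with open(path_to_ebook, 'rb') as infile:
            for line in infile:
                outfile.write(substitute(line))
        outfile.close()  # make sure the text is on disk before returning
        return outfile.name
```

Keeping the substitution in a plain function makes it easy to try out from a normal Python prompt before installing the plugin.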

All times are GMT -4.