12-02-2009, 04:58 PM | #1 |
Connoisseur
Posts: 91
Karma: 108
Join Date: Jan 2008
Device: Palm Treo 680, Sony Reader
|
Including images from PML
I'm trying to convert some old eReader projects to ePub and noticed that Calibre wants images in a different folder than DropBook. According to DropBook, the images should be stored in a bookname_img directory. However, Calibre expects them in an "images" directory:
Code:
(re.compile(r'\\m="(?P<name>.+?)"'), lambda match: '<img src="images/%s" />' % image_name(match.group('name')).strip('\x00')), - Jim |
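For anyone following along, here is a minimal standalone sketch of what that substitution does. It reuses the regular expression from the snippet above, but the function name `pml_images_to_html` and the `image_dir` parameter are made up for illustration, and calibre's own `image_name()` helper is left out, so this is not calibre's actual code.

```python
import re

# Pattern copied from the snippet quoted above: \m="name" marks an image in PML.
IMAGE_TAG = re.compile(r'\\m="(?P<name>.+?)"')


def pml_images_to_html(pml, image_dir="images"):
    # Hypothetical helper: rewrite every image tag as an <img> element
    # whose src points into image_dir.
    def repl(match):
        # Strip NUL padding from the name, as the original lambda does.
        name = match.group("name").strip("\x00")
        return '<img src="%s/%s" />' % (image_dir, name)

    return IMAGE_TAG.sub(repl, pml)
```

Passing `image_dir="bookname_img"` would give the DropBook-style path instead of the `images/` one.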
12-02-2009, 06:11 PM | #2 |
Sigil & calibre developer
Posts: 2,488
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
|
calibre isn't DropBook. I went with a folder called images because I feel that the "bookname"_img directory requirement is ridiculous.
|
12-02-2009, 06:43 PM | #3 |
Connoisseur
Posts: 91
Karma: 108
Join Date: Jan 2008
Device: Palm Treo 680, Sony Reader
|
Aw, come on. How hard can it be to make sure that Calibre is layout-compatible with all external eBook-creation applications?
Alright, fair enough. I can work around this easily enough. - Jim |
12-02-2009, 07:13 PM | #4 | |
Sigil & calibre developer
Posts: 2,488
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
|
Quote:
What you are seeing is the translation of the PML into HTML, where the images are moved into an images folder. Oh, and the code you mentioned is not from the latest version. I rewrote the PML parser. It is much faster and does a better translation. It is in 0.6.25. |
|
12-02-2009, 08:09 PM | #5 |
Sigil & calibre developer
Posts: 2,488
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
|
I've committed a few changes for handling images with regard to PML input.
PMLZ can have images in ./ (top level), archivename_img/, or images/ directories within the archive. PML can have images in bookname_img/ or images/ directories that are in the same location as the PML file. Note that only the first location in the above order that contains PNG files will be used. Also, bookname in bookname_img is the name of the PML file without the extension, not the name set in the metadata. |
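The lookup order described above could be sketched roughly like this. The function name `find_image_dir` and its parameters are hypothetical, and matching the `.png` extension case-insensitively is an assumption; this only illustrates the "first location with PNG files wins" rule, not the committed calibre code.

```python
import os


def find_image_dir(root, base_name, include_top_level=True):
    # Hypothetical helper: return the first candidate directory that
    # contains PNG files, or None if no candidate does.
    # root is the unpacked archive (PMLZ) or the PML file's folder (PML);
    # base_name is the archive or PML file name without its extension.
    candidates = []
    if include_top_level:
        # Only PMLZ archives may keep images at the top level.
        candidates.append(root)
    candidates.append(os.path.join(root, base_name + "_img"))
    candidates.append(os.path.join(root, "images"))

    for path in candidates:
        if not os.path.isdir(path):
            continue
        # Assumption: the extension check ignores case.
        if any(f.lower().endswith(".png") for f in os.listdir(path)):
            return path
    return None
```

For a plain PML file you would call it with `include_top_level=False`, since only bookname_img/ and images/ are checked in that case.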
12-03-2009, 03:49 AM | #6 |
Connoisseur
Posts: 91
Karma: 108
Join Date: Jan 2008
Device: Palm Treo 680, Sony Reader
|
Okay, I've modified my script so that it dumps the images into the root level when it is generating a pmlz archive. DropBook can't handle a zip archive anyways. Progress! Unfortunately, Calibre fails to read Metadata when I convert the PDB to PMLZ on adding, but a quick Metadata re-read fixes that.
Now it's time to start picking at that PML to HTML generator. I want to apply a couple tweaks. - Jim |
12-03-2009, 04:20 AM | #7 |
Connoisseur
Posts: 91
Karma: 108
Join Date: Jan 2008
Device: Palm Treo 680, Sony Reader
|
Okay! This seems to be working for me. I can now extract the cover image from the PMLZ archive. I modified metadata/pml.py as follows:
Code:
def get_metadata(stream, extract_cover=True):
    """ Return metadata as a L{MetaInfo} object """
    mi = MetaInformation(_('Unknown'), [_('Unknown')])
    stream.seek(0)

    pml = ''
    if stream.name.endswith('.pmlz'):
        with TemporaryDirectory('_unpmlz') as tdir:
            zf = ZipFile(stream)
            zf.extractall(tdir)

            pmls = glob.glob(os.path.join(tdir, '*.pml'))
            for p in pmls:
                with open(p, 'r+b') as p_stream:
                    pml += p_stream.read()

            # Look for cover.png at the top level, then in images/,
            # then in the *_img directory.
            coverpath = glob.glob(os.path.join(tdir, 'cover.png'))
            if len(coverpath) == 0:
                imagedir = os.path.join(tdir, 'images')
                coverpath = glob.glob(os.path.join(imagedir, 'cover.png'))
            if len(coverpath) == 0:
                imagedir = os.path.join(tdir, pmls[0] + '_img')
                coverpath = glob.glob(os.path.join(imagedir, 'cover.png'))
            if len(coverpath) > 0:
                # glob.glob returns a list, so open the first match.
                coverdata = open(coverpath[0], 'rb').read()
                if coverdata is not None:
                    mi.cover_data = ('png', coverdata)
    else:
- Jim |
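As a side note, the cover can also be read straight out of the zip without unpacking everything to a temporary directory. This is only a sketch using the standard `zipfile` module: the function name `read_pmlz_cover` is made up, and the candidate order (top level, archivename_img/, images/) is taken from the description earlier in the thread rather than from calibre's source.

```python
import os
import zipfile


def read_pmlz_cover(stream, archive_name):
    # Hypothetical helper: return the bytes of cover.png from a PMLZ
    # (zip) stream, or None if no candidate location has one.
    base = os.path.splitext(os.path.basename(archive_name))[0]
    candidates = [
        "cover.png",               # top level of the archive
        base + "_img/cover.png",   # archivename_img/
        "images/cover.png",        # images/
    ]
    with zipfile.ZipFile(stream) as zf:
        names = set(zf.namelist())
        for name in candidates:
            if name in names:
                return zf.read(name)
    return None
```

Called as `read_pmlz_cover(stream, stream.name)`, it returns raw PNG bytes that could be used for `mi.cover_data` in the function above.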
12-03-2009, 03:16 PM | #8 |
Connoisseur
Posts: 91
Karma: 108
Join Date: Jan 2008
Device: Palm Treo 680, Sony Reader
|
Oops! I forgot. You use the archive name instead of the *.pml name to create the *_img directory. Hrm... I suppose you'd have to use "stream.name" instead.
Truth be told, using archivename_img is pretty useless since archivename.zip and bookname.pml will rarely match. The image directory should really be based on the name of the first *.pml file. - Jim |
12-03-2009, 06:23 PM | #9 |
Sigil & calibre developer
Posts: 2,488
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
|
The only way the PML won't match is if you name it that way. If we are going by how DropBook handles images, the fact that PMLZ supports multiple PML files and DropBook does not makes the whole image directory naming even more arbitrary. Considering that name_img is arbitrary in itself and PMLZ already supports top level, archive_img/, and images/, making sure you name the archive the same as the first images folder if you want a DropBook-style arrangement is not a ridiculous requirement.
|
12-03-2009, 07:21 PM | #10 |
Connoisseur
Posts: 91
Karma: 108
Join Date: Jan 2008
Device: Palm Treo 680, Sony Reader
|
True enough. I'm just saying that it may be unreliable to the point that it's not worth the effort to add support for that. Frankly, like you, I wonder why DropBook requires images to be in a separate directory to begin with. It's an unnecessary complication.
Originally, the directory conflict bugged me because I thought importing a *.pml file would collect the images into a *.pmlz file in much the same way that importing a *.html file does. Since that's not the case, and I have to bring in either a PDB or a PMLZ (neither of which is useful to DropBook anyways), I may as well create them to Calibre's standard. I looked into adding the ability to import the Cover Page from a PDB file, but I don't see an easy way to do it since you have to partially decode the file to get the "cover.png" image. At this point, I've created a preprocessing plugin (on add to database) for any eReader PDB files that returns a *.pmlz file. This way I have PML text properly cleaned. I then just tell it to re-read the meta-data and all of the details show up. With the added code, the cover page is pulled in as well (using calibre-debug). Man, Calibre has become 10 times more useful since adding eReader support. Thanks for all the work you put into this! - Jim |
12-03-2009, 07:45 PM | #11 | |
Sigil & calibre developer
Posts: 2,488
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
|
Quote:
I'll add support for this too. Only two parts would need to be read. The PDB header to get the location of the eReader header (record 0) and from that get the sections that are images. In every case I've seen (I will have to make sure calibre conforms) the cover.png is the first image if there is a cover.png. It should be less processing power than extracting the image from a PMLZ archive. |
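The first of those two parts, reading the PDB header to locate the records, might look something like the sketch below. It relies only on the generic Palm Database layout (record count at byte 76, then 8-byte record entries starting at byte 78); the function names are made up, and the eReader header inside record 0 that says which records are images is not decoded here because its layout isn't given in this thread.

```python
import struct


def pdb_record_offsets(data):
    # Hypothetical helper: return the start offset of every record
    # listed in a Palm Database header.
    # The 2-byte big-endian record count sits at byte 76 of the header.
    (num_records,) = struct.unpack_from(">H", data, 76)
    offsets = []
    for i in range(num_records):
        # Each 8-byte entry: 4-byte offset, 1-byte attributes, 3-byte unique id.
        (offset,) = struct.unpack_from(">L", data, 78 + 8 * i)
        offsets.append(offset)
    return offsets


def pdb_record(data, offsets, index):
    # A record runs up to the next record's offset; the last one runs
    # to the end of the file.
    start = offsets[index]
    end = offsets[index + 1] if index + 1 < len(offsets) else len(data)
    return data[start:end]
```

With the offsets in hand, record 0 can be pulled out with `pdb_record(data, offsets, 0)` and then inspected for the image sections.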
|
12-04-2009, 12:25 AM | #12 | |
Connoisseur
Posts: 91
Karma: 108
Join Date: Jan 2008
Device: Palm Treo 680, Sony Reader
|
Quote:
You'll have to loop until you match "cover.png." Will you be adding cover metadata support to *.pmlz as I've shown, or do I need to get off my lazy butt and submit through Bazaar? Granted, I need to tidy it up a bit and add the "extract_cover" condition. - Jim |
|
12-04-2009, 06:16 AM | #13 |
Sigil & calibre developer
Posts: 2,488
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
|
I added support for this the other day. It takes a bit for Kovid to merge the changes from my branch into trunk. It should be there now.
|
12-05-2009, 01:52 AM | #14 |
Connoisseur
Posts: 91
Karma: 108
Join Date: Jan 2008
Device: Palm Treo 680, Sony Reader
|
Man, I keep creating yesterday's news. Thanks a bunch! I'll update Bazaar in a couple days and try it out.
- Jim |