Old 12-02-2009, 04:58 PM   #1
macr0t0r
Connoisseur
Posts: 91
Karma: 108
Join Date: Jan 2008
Device: Palm Treo 680, Sony Reader
Including images from PML

I'm trying to convert some old eReader projects to ePub and noticed that Calibre wants images in a different folder than DropBook. According to DropBook, the images should be stored in a bookname_img directory. However, Calibre expects them in an "images" directory:
Code:
    (re.compile(r'\\m="(?P<name>.+?)"'), lambda match: '<img src="images/%s" />' % image_name(match.group('name')).strip('\x00')),
Is there a reason for this discrepancy, or should I file this as a bug/feature request?
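
To make the behaviour of that line concrete, here is a minimal standalone sketch of the same substitution. The regular expression is the one quoted above; `image_name` below is only a hypothetical stand-in for calibre's own helper, and `pml_images_to_html` and its `image_dir` parameter are names invented for this illustration.

```python
import re

# Same pattern as the quoted snippet: \m="name" marks an image in PML.
IMAGE_TAG = re.compile(r'\\m="(?P<name>.+?)"')


def image_name(name):
    # Hypothetical stand-in for calibre's image_name() helper (assumption).
    return name.strip()


def pml_images_to_html(pml, image_dir='images'):
    """Replace each PML image tag with an HTML <img> pointing into image_dir."""
    def repl(match):
        name = image_name(match.group('name')).strip('\x00')
        return '<img src="%s/%s" />' % (image_dir, name)
    return IMAGE_TAG.sub(repl, pml)


print(pml_images_to_html(r'Front matter \m="cover.png" and text'))
```

The point is only that the folder name is baked into the generated HTML, which is why the PML side and the HTML side can use different directory layouts.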

- Jim
Old 12-02-2009, 06:11 PM   #2
user_none
Sigil & calibre developer
Posts: 2,488
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
calibre isn't DropBook. The reason I went with a folder called images is that I feel the "bookname"_img directory requirement is ridiculous.
Old 12-02-2009, 06:43 PM   #3
macr0t0r
Connoisseur
Aw, come on. How hard can it be to make sure that Calibre is layout-compatible with all external eBook-creation applications?

Alright, fair enough. I can work around this easily enough.

- Jim
Old 12-02-2009, 07:13 PM   #4
user_none
Sigil & calibre developer
Quote:
Originally Posted by macr0t0r View Post
Aw, come on. How hard can it be to make sure that Calibre is layout-compatible with all external eBook-creation applications?

Alright, fair enough. I can work around this easily enough.

- Jim
Actually, looking at the code again, calibre does not use an image directory at all for PML input. If you want images, you need to put all the .pml and .png files in a zip file with the extension .pmlz. Straight PML input does not support images; only PMLZ does.
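
As a rough illustration of the packaging described here, a PMLZ can be produced with nothing more than Python's standard `zipfile` module. The function name `make_pmlz` and its arguments are made up for this sketch; it simply places the PML file and the PNG files at the top level of the archive.

```python
import os
import zipfile


def make_pmlz(pml_path, image_paths, out_path):
    """Bundle one PML file and its PNG images into a .pmlz (zip) archive.

    Everything is stored at the top level of the archive (an assumption
    made for this sketch).
    """
    with zipfile.ZipFile(out_path, 'w', zipfile.ZIP_DEFLATED) as zf:
        zf.write(pml_path, os.path.basename(pml_path))
        for img in image_paths:
            zf.write(img, os.path.basename(img))
    return out_path
```

A call such as `make_pmlz('book.pml', ['cover.png'], 'book.pmlz')` would then give a single file to hand to calibre.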

What you are seeing is the translation of the PML into HTML where the images are moved into an images folder.

Oh, and the code you mentioned is not from the latest version. I rewrote the PML parser. It is much faster and does a better translation. It is in 0.6.25.
Old 12-02-2009, 08:09 PM   #5
user_none
Sigil & calibre developer
I've committed a few changes for handling images with regard to PML input.

PMLZ can have images in ./ (top level), in archivename_img/, or images/ directories within the archive.

PML can have images in bookname_img/ or images/ directories that are in the same location as the PML file.

One thing to note is that the first location in the above order that contains PNG files will be the only location used. Also, bookname in bookname_img is the name of the PML file without the extension, not the name set in the metadata.
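
The lookup order above can be sketched roughly as follows. This is not calibre's actual code; `first_dir_with_pngs` and its parameters are hypothetical names, and the function only mirrors the rule that the first candidate directory containing PNG files wins.

```python
import glob
import os


def first_dir_with_pngs(base_dir, name, include_top_level):
    """Return the first candidate directory that contains PNG files, or None.

    base_dir          -- extracted archive root (PMLZ) or the PML file's folder
    name              -- archive name (PMLZ) or PML file name, without extension
    include_top_level -- True for PMLZ, where ./ is searched first
    """
    candidates = []
    if include_top_level:
        candidates.append(base_dir)
    candidates.append(os.path.join(base_dir, name + '_img'))
    candidates.append(os.path.join(base_dir, 'images'))
    for directory in candidates:
        if glob.glob(os.path.join(directory, '*.png')):
            return directory  # later candidates are never consulted
    return None
```

So a PMLZ with a stray PNG at the top level would never have its images/ folder consulted, which is worth keeping in mind when laying out an archive.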
Old 12-03-2009, 03:49 AM   #6
macr0t0r
Connoisseur
Okay, I've modified my script so that it dumps the images into the root level when it is generating a pmlz archive. DropBook can't handle a zip archive anyways. Progress! Unfortunately, Calibre fails to read Metadata when I convert the PDB to PMLZ on adding, but a quick Metadata re-read fixes that.

Now it's time to start picking at that PML to HTML generator. I want to apply a couple tweaks.

- Jim
Old 12-03-2009, 04:20 AM   #7
macr0t0r
Connoisseur
Okay! This seems to be working for me. I can now extract the cover image from the PMLZ archive. I modified metadata/pml.py as follows:
Code:
import glob
import os
# MetaInformation, TemporaryDirectory and ZipFile are assumed to be
# imported already in metadata/pml.py, as the rest of the file relies on them.

def get_metadata(stream, extract_cover=True):
    """ Return metadata as a L{MetaInfo} object """
    mi = MetaInformation(_('Unknown'), [_('Unknown')])
    stream.seek(0)

    pml = ''
    if stream.name.endswith('.pmlz'):
        with TemporaryDirectory('_unpmlz') as tdir:
            zf = ZipFile(stream)
            zf.extractall(tdir)

            pmls = glob.glob(os.path.join(tdir, '*.pml'))
            for p in pmls:
                with open(p, 'r+b') as p_stream:
                    pml += p_stream.read()

            # Look for cover.png at the top level, then in images/,
            # then in <pmlname>_img/ (name of the first PML, no extension).
            imagedirs = [tdir, os.path.join(tdir, 'images')]
            if pmls:
                bookname = os.path.splitext(os.path.basename(pmls[0]))[0]
                imagedirs.append(os.path.join(tdir, bookname + '_img'))
            for imagedir in imagedirs:
                coverpath = os.path.join(imagedir, 'cover.png')
                if os.path.exists(coverpath):
                    # Read the first cover found and stop searching.
                    with open(coverpath, 'rb') as cover:
                        mi.cover_data = ('png', cover.read())
                    break
    else:
Try it out and see if that works for you as well (I've only tested it on a Mac). If so, would you kindly push that in?

- Jim
Old 12-03-2009, 03:16 PM   #8
macr0t0r
Connoisseur
Oops! I forgot. You use the archive name instead of the *.pml name to create the *_img directory. Hrm... I suppose you'd have to use "stream.name" instead.

Truth be told, using archivename_img is pretty useless since archivename.zip and bookname.pml will rarely match. The image directory should really be based on the name of the first *.pml file.
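
For illustration, the two naming choices differ only in which path the base name is taken from. The helper `img_dir_name` below is a made-up name, and the example paths are invented.

```python
import os


def img_dir_name(path):
    """Return '<basename without extension>_img' for an archive or PML path."""
    base = os.path.splitext(os.path.basename(path))[0]
    return base + '_img'


# Archive-based name (what would come from stream.name) versus PML-based name.
print(img_dir_name('/tmp/MyArchive.pmlz'))
print(img_dir_name('/tmp/unpacked/mort.pml'))
```

When the archive and the PML inside it carry different names, the two calls give different directory names, which is the mismatch being discussed.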

- Jim
Old 12-03-2009, 06:23 PM   #9
user_none
Sigil & calibre developer
Quote:
Originally Posted by macr0t0r View Post
Truth be told, using archivename_img is pretty useless since archivename.zip and bookname.pml will rarely match.
The only way the PML won't match is if you name it that way. If we are going by how DropBook handles images, the fact that PMLZ supports multiple PML files and DropBook does not makes the whole image directory naming even more arbitrary. Considering that name_img is arbitrary in itself and PMLZ already supports the top level, archive_img/, and images/, making sure you name the archive to match the images folder if you want a DropBook-style arrangement is not a ridiculous requirement.
Old 12-03-2009, 07:21 PM   #10
macr0t0r
Connoisseur
True enough. I'm just saying that it may be unreliable to the point that it's not worth the effort to add support for that. Frankly, like you, I wonder why DropBook requires images to be in a separate directory to begin with. It's an unnecessary complication.

Originally, the directory conflict bugged me because I thought importing a *.pml file would collect the images into a *.pmlz file in much the same way that importing a *.html file does. Since that's not the case, and I have to bring in either a PDB or a PMLZ (neither of which is useful to DropBook anyways), I may as well create them to Calibre's standard.

I looked into adding the ability to import the Cover Page from a PDB file, but I don't see an easy way to do it since you have to partially decode the file to get the "cover.png" image. At this point, I've created a preprocessing plugin (on add to database) for any eReader PDB files that returns a *.pmlz file. This way I have PML text properly cleaned. I then just tell it to re-read the meta-data and all of the details show up. With the added code, the cover page is pulled in as well (using calibre-debug).

Man, Calibre has become 10 times more useful since adding eReader support. Thanks for all the work you put into this!

- Jim
Old 12-03-2009, 07:45 PM   #11
user_none
Sigil & calibre developer
Quote:
Originally Posted by macr0t0r View Post
Originally, the directory conflict bugged me because I thought importing a *.pml file would collect the images into a *.pmlz file in much the same way that importing a *.html file does.
I will add support for this. It's not there currently because I never thought of it, since I don't use the GUI.

Quote:
Originally Posted by macr0t0r View Post
I looked into adding the ability to import the Cover Page from a PDB file, but I don't see an easy way to do it since you have to partially decode the file to get the "cover.png" image.
I'll add support for this too. Only two parts would need to be read. The PDB header to get the location of the eReader header (record 0) and from that get the sections that are images. In every case I've seen (I will have to make sure calibre conforms) the cover.png is the first image if there is a cover.png. It should be less processing power than extracting the image from a PMLZ archive.
Old 12-04-2009, 12:25 AM   #12
macr0t0r
Connoisseur
Quote:
Originally Posted by user_none View Post
In every case I've seen (I will have to make sure calibre conforms) the cover.png is the first image if there is a cover.png. It should be less processing power than extracting the image from a PMLZ archive.
Yah, I looked into this. Sadly, that's not the case. Most eReader books put the Cover as the first page, but many don't. In most of the Terry Pratchett books, the cover.png is the second file because they have the Publisher image on top.

You'll have to loop until you match "cover.png."
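
A rough sketch of such a loop is below. The record table parsing follows the standard Palm database layout (record count as a big-endian 16-bit value at offset 76, then 8-byte entries starting with a 4-byte offset). The image record layout (a `PNG ` marker, a 32-byte NUL-padded name at offset 4, image bytes from offset 62) is an assumption for this illustration and should be checked against real eReader files. Unlike the approach described earlier, this sketch scans every record instead of reading record 0 to locate the image sections. The function names are invented.

```python
import struct


def pdb_record_offsets(data):
    """Return the start offset of every record in a Palm database."""
    (count,) = struct.unpack_from('>H', data, 76)
    return [struct.unpack_from('>L', data, 78 + 8 * i)[0]
            for i in range(count)]


def find_cover(data, name_start=4, name_len=32, image_start=62):
    """Loop over the records until one is named cover.png.

    Returns the image bytes, or None when no such record exists.
    name_start, name_len and image_start encode the ASSUMED record layout.
    """
    offsets = pdb_record_offsets(data) + [len(data)]
    for start, end in zip(offsets, offsets[1:]):
        record = data[start:end]
        if not record.startswith(b'PNG '):
            continue  # not an image record under the assumed layout
        name = record[name_start:name_start + name_len].rstrip(b'\x00')
        if name == b'cover.png':
            return record[image_start:]
    return None
```

With this shape, a publisher logo stored ahead of the cover is simply skipped because its name does not match.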

Will you be adding cover metadata support to *.pmlz as I've shown, or do I need to get off my lazy butt and submit through Bazaar? Granted, I need to tidy it up a bit and add the "extract_cover" condition.

- Jim
Old 12-04-2009, 06:16 AM   #13
user_none
Sigil & calibre developer
Quote:
Originally Posted by macr0t0r View Post
Will you be adding cover metadata support to *.pmlz as I've shown, or do I need to get off my lazy butt and submit through Bazaar? Granted, I need to tidy it up a bit and add the "extract_cover" condition.
I added support for this the other day. It takes a bit for Kovid to merge the changes from my branch into trunk. It should be there now.
Old 12-05-2009, 01:52 AM   #14
macr0t0r
Connoisseur
Man, I keep creating yesterday's news. Thanks a bunch! I'll update Bazaar in a couple days and try it out.

- Jim