Old 10-28-2015, 12:19 PM   #1
jackie_w
Grand Sorcerer
Posts: 6,216
Karma: 16534894
Join Date: Sep 2009
Location: UK
Device: Kobo: KA1, ClaraHD, Forma, Libra2, Clara2E. PocketBook: TouchHD3
Container methods, various scenarios

Apologies if this thread is a bit long. One of the biggest difficulties I've had getting my head around container processing is the various I/O methods coupled with the importance of keeping all the container variables (parsed_cache, dirtied etc) in-sync with each other.

I'd like to outline 4 scenarios, describe what I've been doing to date, and get some advice on what a better method would be. You may be relieved to know that none of the methods below have been used in plugins released in the wild, so no great damage done.

In each case assume we need to do_stuff to an item (name) in a container (con).

do_stuff will be either Batch or Interactive.
  • Batch: no need to worry about user interaction with intermediate contents of on-disk exploded files,
    e.g. run a specific update on every epub in your library.
  • Interactive: on-disk exploded files need to be in-sync with parsed items at all times,
    e.g. user is looking at a QWebView preview to assess every interactive change.

do_stuff can also be categorised by function.
  • Parser-function: those which can be best achieved using lxml or cssutils on a parsed object.
  • Manual-function: those which use ad-hoc methods on raw data, e.g. regex

The 4 scenarios are:
  1. name is an HTML-type. do_stuff is Parser-function and Batch.
    Code:
    root = con.parsed(name)
    do_stuff(root) # with lxml
    con.dirty(name)
  2. name is an HTML-type. do_stuff is Parser-function and Interactive.

    Code:
    root = con.parsed(name)
    do_stuff(root) # with lxml
    con.dirty(name)
    con.commit_item(name, keep_parsed) # not entirely sure when keep_parsed=True is recommended
  3. name is an HTML-type. do_stuff is Manual-function and Batch (I think Interactive is probably the same in this scenario).
    Code:
    data = con.raw_data(name)
    newdata = do_stuff(data) # with regex
    abspath = con.name_to_abspath(name)
    with open(abspath, 'wb') as f:  # I've been too nervous to use con.open because of your dire warning comments and my lack of full understanding!
        f.write(newdata)
    con.parsed_cache.pop(name, False)
    con.dirtied.discard(name)
  4. name is an image (jpg, gif, png). do_stuff is Manual-function and Batch/Interactive

    Code:
    data = con.raw_data(name)  # I think con.parsed(name) is probably the same for an image?
    abspath = con.name_to_abspath(name)
    img = Image()
    img.load(data)  # or img.open(abspath)
    do_stuff(img) # with imagemagick
    img.save(abspath)
Old 10-28-2015, 01:04 PM   #2
kovidgoyal
creator of calibre
Posts: 44,397
Karma: 23798586
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
The idea is that you always use container-provided methods to do file I/O. If you want to work on a parsed representation, use container.parsed() and, after you are done, container.dirty(). If you want to work with bytes, use container.raw_data() to read and container.open(name, 'wb') to write. These methods automatically take care of dirtied items and the parsed cache.

Never write to filesystem files directly. As for previewing in QWebView, you need to virtualize access to the files via QNetworkAccessManager; see how it is done in the editor's live preview panel (preview.py).

The only thing to keep in mind is that you must never mix the two modes of working -- parsed vs. raw. Finish working in one mode, then start working in the other.
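A minimal sketch of the two modes described above, kept strictly separate. The helper names (edit_parsed, edit_raw) and the change_tree callback are hypothetical; container is assumed to be a calibre container object, and raw_data(name, decode=False) is assumed to return the file's bytes.

```python
import re


def edit_parsed(container, name, change_tree):
    """Parsed mode: change the parsed tree in place, then mark it dirty."""
    root = container.parsed(name)   # parsed object held in the container's cache
    change_tree(root)               # hypothetical callback that edits the tree
    container.dirty(name)           # tell the container the tree has changed


def edit_raw(container, name, pattern, replacement):
    """Raw mode: read bytes, change them, write back through the container."""
    data = container.raw_data(name, decode=False)   # assumed to return bytes
    new_data = re.sub(pattern, replacement, data)
    changed = new_data != data
    if changed:
        # container.open() is used instead of writing to the file's abspath,
        # so the container itself looks after dirtied items and the parsed cache.
        with container.open(name, 'wb') as f:
            f.write(new_data)
    return changed
```

Under these assumptions, scenarios 1 and 2 would go through edit_parsed and scenario 3 through edit_raw, with no direct use of name_to_abspath, parsed_cache or dirtied.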
Old 10-28-2015, 01:38 PM   #3
jackie_w
Quote:
Originally Posted by kovidgoyal View Post
As for previewing in QWebView, you need to virtualize access to the files via QNetworkAccessManager; see how it is done in the editor's live preview panel (preview.py).
Looks like more homework required, then
Old 10-28-2015, 03:55 PM   #4
jackie_w
I've had a brief look at preview.py. That's way beyond my current skill level, so I won't be pursuing that any time soon.

For the EbookScramble exercise, all I had in mind was to provide
  1. simple selection of a single source file, within calibre via plugin or standalone drag-drop
  2. ability to change scramble rules - standalone version only, plugin will have fixed MR-agreed rules.
  3. Scramble Now button - all scrambling work done by container parser methods, no user interaction
  4. after scrambling (but before final commit), access to a few FYI dialogs, i.e. no user changes.
    e.g. a dialog which shows text by container name, before vs. after, side-by-side. So exploded files will need to be in-sync with the parsed items. The FYI dialogs are mainly those I wished I'd had before I started testing. As the work is already done may as well tidy it up and make it available - at least in the standalone version.
  5. The only user change possible by this point is the destination directory for the scrambled copy. The calibre plugin version does a basic check so that the destination cannot be a calibre library directory.
  6. Save & exit - where final container.commit(path_to_scrambled_book) is run.

My gut-feel is that there shouldn't be anything unsafe about the above to warrant needing full Preview-mode compliance. If you strongly beg to differ then I'll need to reconsider what options are included.

Re: item 4. I've done this by creating a container of the source book then cloning it to an unedited backup of the Original. I did have some problems with the clone images which didn't exist if I just created a completely independent 2nd container to hold the Original. However, I think it's working now. Maybe a clone isn't exactly what I thought it was. For epub/kepub an independent 2nd container would be acceptable but for an azw3 doing all that packing/unpacking twice is less desirable.
Old 10-28-2015, 10:38 PM   #5
kovidgoyal
Use clone_container() to create clones. But remember that if you use clone_container, you must always use raw_data() and container.open()
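A rough sketch of the read-only "before vs. after" comparison, touching both containers through raw_data() only. The helper names (before_after, changed_names) are hypothetical, and the clone is assumed to have been created beforehand with clone_container(), whose exact signature is not shown in this thread.

```python
def before_after(original, scrambled, name):
    """Return (before, after) bytes for one file, read via raw_data() only."""
    # original: the untouched clone; scrambled: the working container.
    before = original.raw_data(name, decode=False)   # assumed to return bytes
    after = scrambled.raw_data(name, decode=False)
    return before, after


def changed_names(original, scrambled, names):
    """List the names whose contents differ between the two containers."""
    changed = []
    for name in names:
        before, after = before_after(original, scrambled, name)
        if before != after:
            changed.append(name)
    return changed
```

Nothing here writes to the clone, which matches the plan of treating it as a read-only copy of the original.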
Old 10-29-2015, 10:56 AM   #6
jackie_w
Quote:
Originally Posted by kovidgoyal View Post
Use clone_container() to create clones. But remember that if you use clone_container, you must always use raw_data() and container.open()
Yes, clone_container() is what I used.

Looking at the very few places where I actually access the clone contents, it is all read-only and I think using clone_dir() would be sufficient for my needs. That should remove all risk of unintended consequences.

ETA: ... or maybe not. Back to plan A of using the clone_container totally read-only. At least that was working when I left it last night.

Last edited by jackie_w; 10-29-2015 at 12:35 PM. Reason: ETA
Old 10-30-2015, 09:25 AM   #7
jackie_w
I'd like to be able to set the potential EbookScramble plugin variable to minimum_calibre_version = (1, 48, 0)

Is 1.48 OK for all the container class stuff or do I need something much later?

TIA

Last edited by jackie_w; 10-30-2015 at 09:29 AM. Reason: got the version number wrong
Old 10-30-2015, 10:28 AM   #8
kovidgoyal
The container class certainly existed in 1.48, but it is likely missing a few bits of functionality and a few bug fixes. Whether any of that impacts your use case is hard to say. I suggest trying it and seeing; if somebody using it runs into issues in 1.48, you can always bump up the version in a subsequent release.
Old 10-31-2015, 08:47 PM   #9
jackie_w
Question re: container.href_to_name(href, name)

If I run a query something like this to find links which point somewhere within the book rather than to an external location
Code:
for name in list_of_html_names:
    root = container.parsed(name)
    for a in root.xpath('//*[local-name()="a" and @href]'):
        ahref = a.get('href')
        linkto_name = container.href_to_name(ahref, name)
        if linkto_name is not None:
            do_stuff ...
I would have expected container.href_to_name('#someid', name) to return 'current name' rather than 'None'. Am I using the wrong function or maybe the right function but incorrectly?

For the moment I've just used this instead:
Code:
linkto_name = name if ahref.startswith('#') else container.href_to_name(ahref, name)

Last edited by jackie_w; 10-31-2015 at 09:00 PM.
Old 10-31-2015, 10:52 PM   #10
kovidgoyal
That's correct, you have to special case hrefs that are only fragments (start with #)
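A small sketch of that special case wrapped in a helper. The name link_target is hypothetical; the only container method used is href_to_name(href, name), as in the question above.

```python
def link_target(container, href, name):
    """Resolve an href found in the file `name` to a container name, or None."""
    if href.startswith('#'):
        # Fragment-only link: it points into the file that contains it.
        return name
    # Everything else is left to the container, which returns None
    # when the href does not point at a file inside the book.
    return container.href_to_name(href, name)
```

The loop in the question could then call link_target(container, ahref, name) and keep its existing "is not None" test.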
Old 11-01-2015, 06:49 AM   #11
jackie_w
Quote:
Originally Posted by kovidgoyal View Post
That's correct, you have to special case hrefs that are only fragments (start with #)
Hah! Got it right by accident, then. Thanks



Digressing to SVG images, now... Any tips on how I might be able to scramble or nullify svg images?

The calibre cover type of usage (<svg>...<image xlink:href="..."></svg>) is handled OK already but I'm not sure how to handle
  1. name.svg images
  2. name.(x)html pages full of svg drawing commands

I do have a book with an image of type 2. It looks like
Code:
<svg:svg xmlns:xlink="http://www.w3.org/1999/xlink" enable-background="new 0 0 507 681.177" height="96%" id="Penguin_20BW" version="1.1" viewBox="0 0 507 681.177" width="100%" x="0px" xml:space="preserve" y="0px">
    <svg:path d="..." fill="..."/>
    <svg:path d="..." fill="..."/>
    etc etc
</svg:svg>
I nullified this with
Code:
for ele in root.xpath('//*[local-name()="svg"]/*[local-name()="path"]'):
    ele.attrib.clear()
but a sample of one doesn't give me any confidence that this is sufficient.
Old 11-01-2015, 09:30 AM   #12
kovidgoyal
The only safe way is to remove all the elements inside the svg tags. SVG is a whole tag set that is, IIRC, larger than HTML, so trying to scramble individual tags is not going to work very well.
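One way this could look, as a sketch only. It uses the standard library's ElementTree as a stand-in so that it runs on its own; the same iter()/remove() calls should also work on the lxml trees returned by container.parsed(), but that is an assumption. The function name empty_svg_elements and the SAMPLE markup are made up for illustration.

```python
import xml.etree.ElementTree as ET  # stdlib stand-in for an lxml tree

SVG_NS = 'http://www.w3.org/2000/svg'


def empty_svg_elements(root):
    """Remove everything inside each <svg> element; return how many were emptied."""
    svgs = list(root.iter('{%s}svg' % SVG_NS))   # collect first, then mutate
    for svg in svgs:
        for child in list(svg):
            svg.remove(child)
        svg.text = None   # drop any text left between the removed children
    return len(svgs)


# Illustrative input only, loosely modelled on the type 2 page described above.
SAMPLE = (
    '<html xmlns="http://www.w3.org/1999/xhtml" '
    'xmlns:svg="http://www.w3.org/2000/svg"><body>'
    '<svg:svg viewBox="0 0 10 10"><svg:path d="M0 0L1 1"/>'
    '<svg:g><svg:rect width="1" height="1"/></svg:g></svg:svg>'
    '</body></html>'
)
root = ET.fromstring(SAMPLE)
emptied = empty_svg_elements(root)
remaining = [el.tag for el in root.iter() if el.tag.startswith('{%s}' % SVG_NS)]
```

Note that this would also remove an <image> child, so the cover-style wrappers that are already handled separately would need to be skipped before calling it.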
Old 11-08-2015, 02:05 PM   #13
jackie_w
Another question re: simple editing of an image in a container

So far, I've been using:

Code:
data = container.raw_data(imgname)
pixmap = QPixmap()
pixmap.loadFromData(data)

# ... do some pixmap changes via GUI ...

pixmap.save(container.name_to_abspath(imgname))
but pixmap.save breaks your container guidelines of 'don't do direct read/writes'. What should I be using instead to do the write?

TIA
Old 11-08-2015, 03:07 PM   #14
jackie_w
Actually, digging around the internet I found this. It looks a bit convoluted but it seems to work - at least for my limited needs. Is there anything neater?
Code:
data = container.raw_data(imgname)
pixmap = QPixmap()
pixmap.loadFromData(data)

# ... do some pixmap changes via GUI ...

fmt = imgname.rpartition('.')[-1].upper()
byte_array = QByteArray()
buffer = QBuffer(byte_array)
buffer.open(QIODevice.WriteOnly)
if pixmap.save(buffer, fmt):
    with container.open(imgname, 'wb') as f:
        f.write(buffer.data())
Old 11-08-2015, 09:08 PM   #15
kovidgoyal
Use pixmap_to_data():

from calibre.gui2 import pixmap_to_data
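A sketch of how the write step might then look. The helper name write_pixmap is hypothetical, and the serialiser is passed in as to_data so the snippet does not depend on Qt; in a plugin that argument would be calibre's pixmap_to_data, which is assumed here to accept a format keyword and return bytes.

```python
def write_pixmap(container, imgname, pixmap, to_data):
    """Serialise pixmap in the format implied by imgname and store it via the container.

    to_data is the serialiser, e.g. calibre.gui2.pixmap_to_data
    (assumption: it takes a `format` keyword and returns bytes).
    """
    fmt = imgname.rpartition('.')[-1].upper()   # same rule as in the post above
    data = to_data(pixmap, format=fmt)
    # Write through the container rather than to name_to_abspath(imgname).
    with container.open(imgname, 'wb') as f:
        f.write(data)
    return len(data)
```

Usage would be something like write_pixmap(container, imgname, pixmap, pixmap_to_data), replacing the QByteArray/QBuffer steps.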
All times are GMT -4.