10-28-2015, 12:19 PM   #1
jackie_w
Container methods, various scenarios

Apologies if this thread is a bit long. One of the biggest difficulties I've had getting my head around container processing is the variety of I/O methods, coupled with the importance of keeping all the container variables (parsed_cache, dirtied, etc.) in sync with each other.

I'd like to outline 4 scenarios, describe what I've been doing to date, and get some advice on what a better method would be. You may be relieved to know that none of the methods below have been in plugins released in the wild, so no great damage done.

In each case assume we need to do_stuff to an item (name) in a container (con).

do_stuff will be either Batch or Interactive.
  • Batch: no need to worry about user interaction with intermediate contents of on-disk exploded files,
    e.g. run a specific update on every epub in your library.
  • Interactive: on-disk exploded files need to be in sync with parsed items at all times,
    e.g. user is looking at a QWebView preview to assess every interactive change.

do_stuff can also be categorised by function.
  • Parser-function: those which can be best achieved using lxml or cssutils on a parsed object.
  • Manual-function: those which use ad-hoc methods on raw data, e.g. regex
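To make the distinction concrete, here is a minimal sketch of the two kinds of do_stuff. It is illustrative only: xml.etree.ElementTree from the standard library stands in for lxml so the snippet runs without calibre, and the function names (do_stuff_parser, do_stuff_manual) and the sample markup are made up for this example.

```python
import re
import xml.etree.ElementTree as ET

# Made-up sample markup, purely for illustration.
SAMPLE = '<html><body><p class="old">Hello</p><p>World</p></body></html>'


def do_stuff_parser(root):
    """Parser-function (hypothetical): modify the parsed tree in place."""
    for p in root.iter('p'):
        if p.get('class') == 'old':
            p.set('class', 'new')


def do_stuff_manual(data):
    """Manual-function (hypothetical): return new raw text using a regex."""
    return re.sub(r'class="old"', 'class="new"', data)


# Parser route: parse, modify the tree, then serialise it again.
root = ET.fromstring(SAMPLE)
do_stuff_parser(root)
parsed_result = ET.tostring(root, encoding='unicode')

# Manual route: operate on the raw string directly, no tree involved.
manual_result = do_stuff_manual(SAMPLE)
```

The practical difference for the container is where the change ends up: the parser route leaves it in a parsed object that still has to be written out, while the manual route produces new raw data that bypasses any parsed copy.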

The 4 scenarios are:
  1. name is an HTML-type. do_stuff is Parser-function and Batch.
    Code:
    root = con.parsed(name)
    do_stuff(root) # with lxml
    con.dirty(name)
  2. name is an HTML-type. do_stuff is Parser-function and Interactive.

    Code:
    root = con.parsed(name)
    do_stuff(root) # with lxml
    con.dirty(name)
    con.commit_item(name, keep_parsed) # not entirely sure when keep_parsed=True is recommended
  3. name is an HTML-type. do_stuff is Manual-function and Batch (I think Interactive is probably the same in this scenario).
    Code:
    data = con.raw_data(name)
    newdata = do_stuff(data) # with regex
    abspath = con.name_to_abspath(name)
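    with open(abspath, 'wb') as f:  # I've been too nervous to use con.open because of your dire warning comments and my lack of full understanding!
        f.write(newdata.encode('utf-8'))  # assumes raw_data returned decoded text and the file is utf-8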
    con.parsed_cache.pop(name, False)
    con.dirtied.discard(name)
  4. name is an image (jpg, gif, png). do_stuff is Manual-function and Batch/Interactive

    Code:
    data = con.raw_data(name)  # I think con.parsed(name) is probably the same for an image?
    abspath = con.name_to_abspath(name)
    img = Image()  # assuming Image is calibre.utils.magick.Image
    img.load(data)  # alternatively: img.open(abspath)
    do_stuff(img) # with imagemagick
    img.save(abspath)
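To show what I mean by keeping parsed_cache and dirtied in sync, here is a toy model of the bookkeeping behind scenarios 1 to 3. It is only a sketch under my own assumptions: ToyContainer is a hypothetical stand-in that borrows the method names used above, it is not calibre's container, and xml.etree.ElementTree replaces lxml so the snippet is self-contained.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


class ToyContainer:
    """Hypothetical stand-in for a container; NOT calibre's implementation."""

    def __init__(self, root_dir):
        self.root_dir = root_dir
        self.parsed_cache = {}  # name -> parsed tree
        self.dirtied = set()    # names whose parsed tree is newer than disk

    def name_to_abspath(self, name):
        return os.path.join(self.root_dir, name)

    def raw_data(self, name):
        # Assumed behaviour: flush pending parsed changes before reading disk.
        self.commit_item(name)
        with open(self.name_to_abspath(name), encoding='utf-8') as f:
            return f.read()

    def parsed(self, name):
        if name not in self.parsed_cache:
            self.parsed_cache[name] = ET.fromstring(self.raw_data(name))
        return self.parsed_cache[name]

    def dirty(self, name):
        self.dirtied.add(name)

    def commit_item(self, name, keep_parsed=False):
        if name not in self.dirtied:
            return
        self.dirtied.remove(name)
        if keep_parsed:
            tree = self.parsed_cache[name]
        else:
            tree = self.parsed_cache.pop(name)
        with open(self.name_to_abspath(name), 'w', encoding='utf-8') as f:
            f.write(ET.tostring(tree, encoding='unicode'))


with tempfile.TemporaryDirectory() as tmp:
    con = ToyContainer(tmp)
    name = 'page.xhtml'
    with open(con.name_to_abspath(name), 'w', encoding='utf-8') as f:
        f.write('<html><body><p>one</p></body></html>')

    # Scenarios 1/2: edit the parsed tree, mark it dirty, then commit to disk.
    root = con.parsed(name)
    root.find('body/p').text = 'two'
    con.dirty(name)
    con.commit_item(name, keep_parsed=True)
    after_commit = con.raw_data(name)

    # Scenario 3: write raw data directly, then drop the stale cached state.
    newdata = after_commit.replace('two', 'three')
    with open(con.name_to_abspath(name), 'w', encoding='utf-8') as f:
        f.write(newdata)
    con.parsed_cache.pop(name, None)
    con.dirtied.discard(name)
    reparsed_text = con.parsed(name).find('body/p').text
```

In this toy model, skipping the pop/discard step in scenario 3 would leave the old tree in parsed_cache, so the next parsed(name) call would hand back the pre-regex content. That is exactly the kind of mismatch I'm trying to avoid in the real container.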