Apologies if this thread is a bit long. One of the biggest difficulties I've had in getting my head around container processing is the variety of I/O methods, coupled with the importance of keeping all the container variables (parsed_cache, dirtied, etc.) in sync with each other.
I'd like to outline 4 scenarios, describe what I've been doing to date, and get some advice on what a better method would be. You may be relieved to know that none of the methods below have been used in plugins released in the wild, so no great damage done.
In each case assume we need to do_stuff to an item (name) in a container (con).
do_stuff will be either Batch or Interactive.
- Batch: no need to worry about user interaction with intermediate contents of on-disk exploded files,
e.g. run a specific update on every epub in your library.
- Interactive: on-disk exploded files need to be in sync with parsed items at all times,
e.g. user is looking at a QWebView preview to assess every interactive change.
do_stuff can also be categorised by function.
- Parser-function: those which can be best achieved using lxml or cssutils on a parsed object.
- Manual-function: those which use ad-hoc methods on raw data, e.g. regex
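To make the Parser-function / Manual-function distinction concrete, here is a minimal sketch. It uses the standard library's xml.etree.ElementTree purely as a stand-in for lxml so that it runs anywhere. The sample markup and the two do_stuff_* names are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET  # stand-in for lxml in this sketch

# Made-up sample content, not taken from any real book.
RAW = '<html><body><p class="old">Hello</p><p>World</p></body></html>'


def do_stuff_parsed(root):
    """Parser-function: mutate the parsed tree in place."""
    for p in root.iter('p'):
        if p.get('class') == 'old':
            p.set('class', 'new')


def do_stuff_manual(data):
    """Manual-function: return a modified copy of the raw text."""
    return re.sub(r'class="old"', 'class="new"', data)


# Parser route: parse, edit the tree, serialise again.
root = ET.fromstring(RAW)
do_stuff_parsed(root)
parsed_result = ET.tostring(root, encoding='unicode')

# Manual route: work directly on the raw string.
manual_result = do_stuff_manual(RAW)
```

The practical difference for the scenarios that follow is that the parser route edits an object the container may be caching, whereas the manual route produces new raw data that the container knows nothing about until it is written back.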
The 4 scenarios are:
- name is an HTML-type. do_stuff is Parser-function and Batch.
Code:
root = con.parsed(name)
do_stuff(root) # with lxml
con.dirty(name)
- name is an HTML-type. do_stuff is Parser-function and Interactive.
Code:
root = con.parsed(name)
do_stuff(root) # with lxml
con.dirty(name)
con.commit_item(name, keep_parsed=False) # not entirely sure when keep_parsed=True is recommended
- name is an HTML-type. do_stuff is Manual-function and Batch (I think Interactive is probably the same in this scenario).
Code:
data = con.raw_data(name)
newdata = do_stuff(data) # with regex
abspath = con.name_to_abspath(name)
with open(abspath, 'wb') as f: # I've been too nervous to use con.open because of your dire warning comments and my lack of full understanding!
    f.write(newdata) # encode first if newdata is text rather than bytes
con.parsed_cache.pop(name, False)
con.dirtied.discard(name)
- name is an image (jpg, gif, png). do_stuff is Manual-function and Batch/Interactive
Code:
data = con.raw_data(name) # I think con.parsed(name) is probably the same for an image?
abspath = con.name_to_abspath(name)
img = Image()
img.load(data) # alternatively: img.open(abspath)
do_stuff(img) # with imagemagick
img.save(abspath)
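The bookkeeping behind these scenarios can be sketched with a toy model. ToyContainer below is a hypothetical stand-in written only for this illustration. It is not calibre's Container and makes no claim about how the real one is implemented. It simply mimics the idea that parsed_cache, dirtied and the on-disk file must agree. replace_raw is an invented helper name, and xml.etree.ElementTree again stands in for lxml.

```python
import os
import tempfile
import xml.etree.ElementTree as ET  # stand-in for lxml


class ToyContainer:
    """Hypothetical model of the bookkeeping only, NOT calibre's Container."""

    def __init__(self, root_dir):
        self.root_dir = root_dir
        self.parsed_cache = {}  # name -> parsed tree
        self.dirtied = set()    # names whose tree is newer than the file

    def name_to_abspath(self, name):
        return os.path.join(self.root_dir, name)

    def raw_data(self, name):
        # Flush pending tree changes first so stale bytes are never returned.
        self.commit_item(name)
        with open(self.name_to_abspath(name), 'rb') as f:
            return f.read()

    def parsed(self, name):
        if name not in self.parsed_cache:
            self.parsed_cache[name] = ET.fromstring(self.raw_data(name))
        return self.parsed_cache[name]

    def dirty(self, name):
        self.dirtied.add(name)

    def commit_item(self, name, keep_parsed=False):
        if name not in self.dirtied:
            return
        if keep_parsed:
            tree = self.parsed_cache[name]
        else:
            tree = self.parsed_cache.pop(name)
        with open(self.name_to_abspath(name), 'wb') as f:
            f.write(ET.tostring(tree))
        self.dirtied.discard(name)

    def replace_raw(self, name, data):
        """Invented helper: write bytes directly, then drop stale state."""
        with open(self.name_to_abspath(name), 'wb') as f:
            f.write(data)
        self.parsed_cache.pop(name, None)
        self.dirtied.discard(name)


def read_disk(con, name):
    with open(con.name_to_abspath(name), 'rb') as f:
        return f.read()


workdir = tempfile.mkdtemp()
with open(os.path.join(workdir, 'a.html'), 'wb') as f:
    f.write(b'<html><body><p>old</p></body></html>')
con = ToyContainer(workdir)

# Parser-function: edit the tree and mark it dirty. The file is still stale.
root = con.parsed('a.html')
root.find('body/p').text = 'new'
con.dirty('a.html')
on_disk_before = read_disk(con, 'a.html')

# Interactive case: commit so the exploded file matches the tree.
con.commit_item('a.html')
on_disk_after = read_disk(con, 'a.html')

# Manual-function: rewrite the raw bytes, then invalidate cached state.
data = con.raw_data('a.html')
con.replace_raw('a.html', data.replace(b'new', b'newer'))
reparsed_text = con.parsed('a.html').find('body/p').text
```

In this toy model the Batch scenario can stop at dirty() and leave the commit for later, while the Interactive scenario needs the commit straight away so that a preview reading the file sees the change. The manual route has to discard any cached tree, otherwise a later parsed() call would return the pre-edit version.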