Errm, I'd suggest you get the data by extracting the metadata from the files themselves. I think most of our file backends already have some way to access metadata. Metadata typically _is_ part of the EPUB, or the PDF, or the DjVu file.
Of course, one could start another way. However, most existing readers use the metadata embedded in the document files, and I'd say that is what users expect. Extraction is a bit I/O intensive, so the extracted metadata is typically cached in a database of some kind, with a reference to the actual file stored alongside it.
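To make the caching idea concrete, here is a rough sketch in Python, not a proposal for how our backends should actually do it. It only handles EPUB (reading `dc:title` and `dc:creator` from the OPF package document) and caches the result in SQLite keyed by file path. The function names, the table layout, and the choice of modification time plus size as the staleness check are all my own assumptions for illustration.

```python
import json
import os
import sqlite3
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the EPUB container file and Dublin Core metadata.
NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def extract_epub_metadata(path):
    """Read title and author embedded in an EPUB (hypothetical helper)."""
    with zipfile.ZipFile(path) as zf:
        # container.xml points at the OPF package document.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        opf_path = container.find(".//c:rootfile", NS).get("full-path")
        opf = ET.fromstring(zf.read(opf_path))
    return {
        "title": opf.findtext(".//dc:title", default="", namespaces=NS),
        "author": opf.findtext(".//dc:creator", default="", namespaces=NS),
    }


def open_cache(db_path=":memory:"):
    """Open (or create) the metadata cache; the schema is an assumption."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS metadata ("
        " path TEXT PRIMARY KEY,"
        " mtime_ns INTEGER,"
        " size INTEGER,"
        " data TEXT)"
    )
    return conn


def get_metadata(conn, path, extractor=extract_epub_metadata):
    """Return cached metadata, re-extracting only if the file changed."""
    st = os.stat(path)
    row = conn.execute(
        "SELECT mtime_ns, size, data FROM metadata WHERE path = ?", (path,)
    ).fetchone()
    if row and row[0] == st.st_mtime_ns and row[1] == st.st_size:
        return json.loads(row[2])  # cache hit: no need to open the document
    data = extractor(path)  # cache miss or stale entry: do the expensive read
    conn.execute(
        "INSERT OR REPLACE INTO metadata VALUES (?, ?, ?, ?)",
        (path, st.st_mtime_ns, st.st_size, json.dumps(data)),
    )
    conn.commit()
    return data
```

The point of storing the modification time and size next to the path is that the cache can notice when a file was replaced without opening it. A real implementation would probably also want to handle moved or deleted files, and other formats such as PDF and DjVu through their own extractors.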