I've been wondering for a while about an idea that I know also came up on the Plugin Ideas sticky thread ages ago. It is the idea of being able to import lists of books.
These lists might be things like Hugo nominees from Wikipedia, or someone's ad hoc list of favourite books/top 100/whatever. It could be a "new/upcoming releases" list or ... you get the idea. The intent is to make this as streamlined as possible (once a source has been configured), minimising the clicks and pain involved. My ideal is that you are just browsing away, see someone's list of recommended books in a forum post, paste it into calibre and as quickly as possible see which books you have etc.
Right now with calibre there is no way to do anything like this with a group/list of books.
So what would you actually "do" with the list? Well you might simply want them available as a retrievable list to view. You might want to tag/apply a custom column value like "Hugo" or whatever. You might want to put the books on that list onto a device. These are features the Reading List plugin already has, which is why I think it would make sense to extend this plugin rather than just creating a new one.
There are obviously a lot of technical difficulties/challenges. I want the ability to paste a text-based list of books from the clipboard. I also want to be able to point at a web page and import the list from there. It *will* involve the user creating regular expressions for the former, and xpath expressions for the latter, so some technical expertise is required. However I want those sets of expressions to be named/saved so they can be retrieved from a dropdown, and potentially shared. So for argument's sake there might be a "Wikipedia Hugo" configuration which works with any Wikipedia pages that have data displayed like this one. Or maybe it only works with that one page, depending on each usage.
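To make the "named/saved and shareable" part a bit more concrete, here is a rough sketch of how a profile might be stored. Everything in it is an assumption for illustration: the `ImportProfile` name, its fields and the JSON layout are hypothetical, not anything the Reading List plugin has today.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ImportProfile:
    """Hypothetical saved configuration for one kind of list source."""
    name: str                        # what the user sees in the dropdown
    kind: str                        # "regex" for pasted text, "xpath" for web pages
    expressions: dict                # e.g. {"line": "..."} or {"table": "...", "rows": "..."}
    swap_author_names: bool = False  # flip "LN, FN" to "FN LN" on import


def dumps_profiles(profiles):
    """Serialise profiles to a JSON string that could be exported and shared."""
    return json.dumps([asdict(p) for p in profiles], indent=2, sort_keys=True)


def loads_profiles(text):
    """Rebuild profiles from a JSON string produced by dumps_profiles()."""
    return [ImportProfile(**item) for item in json.loads(text)]
```

A plain JSON file keeps sharing simple: a user could post a profile in a thread and someone else could paste it in.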
From a UI perspective there would be a series of steps.
Step 1 - the user would specify their source - be it pasting in text from the clipboard, pointing to a URL, or importing from a file.
Step 2 - this is all about converting the source into a list of titles/authors. The user creates or selects an import profile to match that source. So here is where you will be doing your regex or xpath thing, with a view on the left of the raw text and a grid on the right showing the results. A checkbox will let you flip the author LN, FN to match whatever convention you use in your library.
Step 3 - the matching against books in your library. This will have lots of fun problems to resolve. How titles/authors are named in the list might be different to your library - author initials, suffixes, missing authors, title mismatches etc. Obviously there might be no matches because you simply don't have that book. Or there might be multiple matches, perhaps because you have multiple editions or the title is too ambiguous. The user must be able to easily refine each book in the list to find their matching calibre book.
Step 4 - saving your list (if you decide to) by creating/updating a Reading List, and configuring the list as you do now.
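As a rough illustration of the author flip in step 2 and the matching in step 3, here is a minimal sketch. It assumes the library is just a list of `(book_id, title, author)` tuples rather than going through the calibre database, and the function names and the 0.85 threshold are made up for the example. It uses `difflib` from the standard library for fuzzy comparison, which is only one of several ways this could be done.

```python
import re
from difflib import SequenceMatcher


def flip_author(author):
    """Turn 'Last, First' into 'First Last'; leave anything else unchanged."""
    parts = [p.strip() for p in author.split(",")]
    if len(parts) == 2 and all(parts):
        return f"{parts[1]} {parts[0]}"
    return author.strip()


def normalise(text):
    """Lowercase, replace punctuation with spaces and collapse whitespace."""
    return " ".join(re.sub(r"[^\w\s]", " ", text.lower()).split())


def title_key(title):
    """Normalised title with a leading English article removed."""
    return re.sub(r"^(the|a|an)\s+", "", normalise(title))


def ratio(a, b):
    """Similarity between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a, b).ratio()


def find_matches(title, author, library, threshold=0.85):
    """Return ids of library books that look like (title, author), best first.

    An empty result means 'not in library'; more than one id means the
    user has to pick (multiple editions, ambiguous title, etc.).
    """
    scored = []
    for book_id, lib_title, lib_author in library:
        title_score = ratio(title_key(title), title_key(lib_title))
        # A list entry with no author is matched on title alone.
        author_score = ratio(normalise(author), normalise(lib_author)) if author else 1.0
        score = min(title_score, author_score)
        if score >= threshold:
            scored.append((score, book_id))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [book_id for _, book_id in scored]
```

Zero, one or many ids coming back map onto the three cases in step 3: no match, a clean match, or something the user has to resolve by hand.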
For text-based lists (like pasting from the clipboard) you will just specify a single regex, a bit like what you do when adding books, just to identify the title/author from it.
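For example, a single regex with named groups could pull the title and author out of each pasted line. This is only a sketch: the pattern assumes lines of the form "Title by Author" with an optional leading number, and the `title`/`author` group names and the `parse_list` helper are my own choices for the illustration.

```python
import re

# Assumed line format: optional "1." or "1)" prefix, then "Title by Author".
LINE_PATTERN = re.compile(
    r"^\s*(?:\d+[.)]\s*)?(?P<title>.+?)\s+by\s+(?P<author>.+?)\s*$",
    re.IGNORECASE,
)


def parse_list(text, pattern=LINE_PATTERN):
    """Return (title, author) tuples for every line the pattern matches."""
    books = []
    for line in text.splitlines():
        match = pattern.match(line)
        if match:  # lines that do not fit the pattern are skipped
            books.append((match.group("title"), match.group("author")))
    return books
```

A pattern this simple splits at the first " by ", so a title that itself contains " by " would be cut short, which is exactly why the results grid in step 2 needs to be easy to eyeball and correct.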
For web pages it will be more complex - I will assume that data is presented in a table and the user must specify the xpath expressions to identify the table of interest, the rows, and then the title and author data within. Anyone who has ever had to write a metadata source or Store plugin will be well familiar with the scraping techniques and challenges. It won't work with every website list out there, but hopefully it will cover enough.
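Here is a rough sketch of the table/rows/title/author idea. To keep it self-contained it uses the standard library's `xml.etree.ElementTree`, which only understands a small XPath subset and needs well-formed markup; real web pages would need a forgiving HTML parser instead. The sample markup, the `extract_books` name and the four expressions are all made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed sample markup standing in for a downloaded page.
SAMPLE = """<html><body>
<table class="nav"><tr><td>Home</td></tr></table>
<table class="wikitable">
<tr><th>Year</th><th>Author</th><th>Novel</th></tr>
<tr><td>1966</td><td>Frank Herbert</td><td><i>Dune</i></td></tr>
<tr><td>1970</td><td>Ursula K. Le Guin</td><td><i>The Left Hand of Darkness</i></td></tr>
</table>
</body></html>"""


def extract_books(markup, table_xpath, row_xpath, title_xpath, author_xpath):
    """Apply the four user-supplied expressions and return (title, author) tuples."""
    root = ET.fromstring(markup)
    table = root.find(table_xpath)
    if table is None:
        return []
    books = []
    for row in table.findall(row_xpath):
        title_el = row.find(title_xpath)
        author_el = row.find(author_xpath)
        if title_el is None or author_el is None:
            continue  # e.g. the header row, which has <th> cells only
        # itertext() gathers text from nested tags such as <i> or <a>.
        title = "".join(title_el.itertext()).strip()
        author = "".join(author_el.itertext()).strip()
        if title and author:
            books.append((title, author))
    return books


books = extract_books(
    SAMPLE,
    table_xpath=".//table[@class='wikitable']",
    row_xpath="tr",
    title_xpath="td[3]",
    author_xpath="td[2]",
)
```

The four expressions are exactly the values a saved profile for that page layout would need to hold.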
There's also loads of scope for what else this feature might ultimately do, including things I haven't mentioned but which may need thinking about early on. Exporting/sharing lists with other users. Sharing these configuration profiles - is it done like the Search the Internet plugin's approach, with some bundled but also exportable individually? What about updatable lists - if I have a "top 50 books of 2012" list from Goodreads, this will change every month and I will want to minimise the effort of list updating.
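On the updatable lists point, one cheap way to minimise the effort might be to compare a fresh import against the previous one so only the changes need attention. A minimal sketch, assuming both imports are lists of `(title, author)` tuples and comparing them case-insensitively; `diff_lists` is a hypothetical helper name.

```python
def diff_lists(old, new):
    """Return (added, removed) entries between two imports of the same list."""
    def key(book):
        title, author = book
        return (title.casefold(), author.casefold())

    old_keys = {key(b) for b in old}
    new_keys = {key(b) for b in new}
    added = [b for b in new if key(b) not in old_keys]
    removed = [b for b in old if key(b) not in new_keys]
    return added, removed
```

Only the `added` entries would then need to go through the matching step again.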
Any thoughts appreciated... in particular what examples of websites/lists would you like to be able to import?