They're what's left after sorting and cleaning a huge ebook collection. Nearly all are in old formats like txt, html, rar, rtf and doc, so they have no metadata other than what Calibre tried to glean from the file names.
Try a random text search in Google Books or Amazon. You'll be really surprised how accurate they are; there's no need for fancy AI. Here's an example:
https://www.google.com/search?tbo=p&...=10&gws_rd=ssl
The best Google Books search page is this one:
http://books.google.com/advanced_book_search Paste the random text from your unknown book into the 'Exact Phrase' box and search.
For Amazon the best search page is:
http://www.amazon.com/Advanced-Searc...node=241582011 Paste your random text (inside quotation marks) into the 'Keywords' box and search.
This same technique can also be used in Bookfinder:
http://www.bookfinder.com/?mode=adva...=&currency=EUR
Could this not be automated in the same way as the search for metadata and covers?
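To give an idea of what the first step of such automation might look like, here is a rough Python sketch that pulls a random run of words out of a book's text and builds an exact-phrase query URL from it. Everything in it is my own assumption rather than anything Calibre does today: the function names pick_phrase and exact_phrase_url are made up for illustration, and I've pointed it at the public Google Books API volumes endpoint instead of the advanced search page linked above, since that is easier to query from a script.

```python
import random
from urllib.parse import urlencode

# Assumption: the public Google Books API "volumes" endpoint is used here
# in place of the advanced search web page.
BOOKS_API = "https://www.googleapis.com/books/v1/volumes"


def pick_phrase(text, n_words=8, rng=None):
    """Return n_words consecutive words taken from a random position in text."""
    rng = rng or random.Random()
    words = text.split()
    if len(words) <= n_words:
        # Text is too short to sample from, so use all of it.
        return " ".join(words)
    start = rng.randrange(len(words) - n_words + 1)
    return " ".join(words[start:start + n_words])


def exact_phrase_url(phrase):
    """Build a query URL that searches for the phrase wrapped in quotation marks."""
    query = urlencode({"q": f'"{phrase}"'})
    return f"{BOOKS_API}?{query}"


if __name__ == "__main__":
    sample = "It was the best of times, it was the worst of times, it was the age of wisdom"
    phrase = pick_phrase(sample, n_words=6)
    print(phrase)
    print(exact_phrase_url(phrase))
```

This only builds the URL. Fetching the results, picking the best match and feeding the title and author back into Calibre would still have to be written, and a phrase taken from the middle of the book is probably a safer bet than one from the front matter.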