Ebook tagging service (idea phase)
I wanted to get the community's feedback on an idea, and maybe a few suggestions, before working out how to set it up and looking for volunteers.
I have a very large ebook collection and am having a hard time tagging every book. I find that sometimes the author's name contains part of the series name, and the metadata will not download correctly. Fixing them all manually would take hundreds of hours (and that is a conservative estimate).
I wanted the equivalent of freedb (music) for ebooks.
Here is the idea.
We build a web service that can do a lookup based on a sentence in the book. We compute a hash code for that sentence and store it in a database with a link to the associated metadata for that book. Since we don't want to store every sentence of the book in the database, we would look for common markers such as "Chapter 1", "Part 1", or other keywords. If we cannot find those, maybe just take sentences over 10 characters long from the first 5-10 pages?
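To make the fingerprinting step concrete, here is a rough Python sketch. Everything in it is an assumption for illustration, not a settled design: the names `normalize`, `fingerprint`, and `candidate_sentences` are made up, SHA-1 is just one possible hash, and I am guessing that the sentence to hash is the first long line after each chapter/part marker.

```python
import hashlib
import re

# Hypothetical marker pattern: "Chapter 1", "Part II", and so on.
MARKER = re.compile(r"^(chapter|part)\s+(\d+|[ivxlcdm]+)\b", re.IGNORECASE)


def normalize(sentence):
    """Lowercase, drop punctuation, and collapse whitespace so that small
    formatting differences between copies do not change the hash."""
    text = re.sub(r"[^\w\s]", "", sentence.lower())
    return " ".join(text.split())


def fingerprint(sentence):
    """Return a hex digest for the normalized sentence (SHA-1 is an assumption)."""
    return hashlib.sha1(normalize(sentence).encode("utf-8")).hexdigest()


def candidate_sentences(lines, min_length=10, limit=5):
    """Pick the lines worth hashing.

    Prefer the first long-enough line after each chapter/part marker.
    If no marker is found, fall back to any line longer than min_length.
    """
    stripped = [line.strip() for line in lines]
    picked = []
    for i, line in enumerate(stripped):
        if MARKER.match(line):
            for following in stripped[i + 1:]:
                if len(following) > min_length and not MARKER.match(following):
                    picked.append(following)
                    break
    if not picked:
        picked = [line for line in stripped if len(line) > min_length]
    return picked[:limit]
```

A plugin would call `candidate_sentences` on the first few pages of text and send `fingerprint(s)` for each result. Normalizing before hashing matters because two conversions of the same book rarely agree on whitespace and punctuation.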
To populate this database we would have to build a plugin and get volunteers to run it on their collections. For each book that contains an ISBN and has a cover, description, and tags, we check whether it is already in the database; if not, we add its data. Very quickly we could probably get hundreds of thousands of books into the database.
I would also like to find out whether there is a way to put my own identifier in the ISBN field. For books like short stories that have no ISBN, if someone tags one manually we would like to share that metadata.
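The plugin-side check could look something like the sketch below. It assumes metadata arrives as a plain dict and that the database can be treated as a mapping keyed by the sentence hash; the field names and the `contribute` function are hypothetical.

```python
# Fields a book must have before the plugin contributes it (assumed names).
REQUIRED_FIELDS = ("isbn", "cover", "description", "tags")


def is_complete(metadata):
    """True only if every required field is present and non-empty."""
    return all(metadata.get(field) for field in REQUIRED_FIELDS)


def contribute(database, sentence_hash, metadata):
    """Add a book to the shared database if it is fully tagged and not
    already known. Returns True when a new record was stored."""
    if not is_complete(metadata):
        return False
    if sentence_hash in database:
        return False
    database[sentence_hash] = dict(metadata)  # store a copy, not the caller's dict
    return True
```

Skipping incomplete records at the source keeps half-tagged books out of the shared data, at the cost of ignoring books that only one volunteer could have partially described.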
Interface
addBook(sentence, cover, metadata)
mergeBook(sentence, cover, metadata)
Used for merging two sets of metadata. Makes sure that every field is populated.
containsBook(filename)
containsBook(sentence)
lookupBook(sentence, filename)
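As a starting point for discussion, the interface above could be mocked up in memory like this. It is only a sketch under several assumptions: Python has no overloading, so the two `containsBook` variants become `contains_sentence` and `contains_filename`; the filename is assumed to travel inside the metadata dict under a `"filename"` key; and "merge" is taken to mean filling in blank fields without overwriting existing values. The class name `BookIndex` is hypothetical.

```python
import hashlib


def _key(sentence):
    """Hash a whitespace- and case-normalized sentence (SHA-1 is an assumption)."""
    normalized = " ".join(sentence.lower().split())
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


class BookIndex:
    """In-memory stand-in for the proposed web service."""

    def __init__(self):
        self._by_hash = {}
        self._by_filename = {}

    def add_book(self, sentence, cover, metadata):
        record = dict(metadata, cover=cover)
        self._by_hash[_key(sentence)] = record
        if record.get("filename"):
            self._by_filename[record["filename"]] = record
        return record

    def merge_book(self, sentence, cover, metadata):
        existing = self._by_hash.get(_key(sentence))
        if existing is None:
            return self.add_book(sentence, cover, metadata)
        # Fill only the fields that are still empty; keep existing values.
        for field, value in dict(metadata, cover=cover).items():
            if not existing.get(field) and value:
                existing[field] = value
        if existing.get("filename"):
            self._by_filename[existing["filename"]] = existing
        return existing

    def contains_sentence(self, sentence):
        return _key(sentence) in self._by_hash

    def contains_filename(self, filename):
        return filename in self._by_filename

    def lookup_book(self, sentence, filename=None):
        # Try the content hash first, then fall back to the filename.
        record = self._by_hash.get(_key(sentence))
        if record is None and filename is not None:
            record = self._by_filename.get(filename)
        return record
```

The filename fallback is much weaker than the content hash, since filenames are neither unique nor stable, so a real service would probably treat a filename-only match as a suggestion rather than a hit.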
First, has anyone tried anything like this yet? It seems that the content of the book is the only truly unique way to associate your copy of a book with mine as the same book if we can't both find the ISBN.
Would people be willing to help populate this service by running it on their collections? Are any developers interested in helping? I am more of a Java/C# guy and would probably be better suited to the backend, but I would figure out how to write some Python if necessary.
Feedback appreciated.
Last edited by jorm; 03-13-2012 at 10:18 AM.