03-13-2012, 01:18 PM   #10
Backi
Connoisseur
Posts: 99
Karma: 15776
Join Date: Dec 2011
Device: PB912 Matt White
Quote:
Originally Posted by jorm
Seems that the content of the book is the only truly unique way to associate your copy of the book with mine as the same book if we can't both find the ISBN.
The problem is that you can't always map contents back to the container (here, a book).

Mathematically speaking, the mapping from containers to contents is surjective but not injective, so it is generally not invertible, i.e. the content does not always determine a distinct container/book:
With the sentences approach you could identify a closed unit (story, romance, poem), but not which container (book/anthology/collection) it is in, as the same story can be contained in more than one book.
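
To illustrate the point, here is a minimal sketch of a lookup from content to containers. Everything in it is an assumption made for illustration: the class name ContentIndex, the book titles used with it, and the fingerprint, which only normalizes whitespace and case (a real system would use a proper hash of the text). It shows that a lookup by content returns a set of books rather than a single book.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical index, for illustration only.
final class ContentIndex {
    // content fingerprint -> titles of every container that holds this content
    private final Map<String, Set<String>> containersByContent = new HashMap<>();

    // Register a container (book/anthology/collection) and the items it contains.
    void add(String containerTitle, List<String> itemTexts) {
        for (String text : itemTexts) {
            containersByContent
                .computeIfAbsent(fingerprint(text), k -> new TreeSet<>())
                .add(containerTitle);
        }
    }

    // The reverse lookup yields a set: possibly empty, possibly several books.
    Set<String> containersOf(String itemText) {
        return containersByContent.getOrDefault(
            fingerprint(itemText), Collections.emptySet());
    }

    // Placeholder fingerprint (assumption): collapse whitespace, ignore case.
    static String fingerprint(String text) {
        return text.trim().replaceAll("\\s+", " ").toLowerCase(Locale.ROOT);
    }
}
```

If the same story is added under two different titles, containersOf returns both, so the story alone cannot tell you which book you are holding.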

To identify a container, one has to consider the hash values of all items in it (that's how the hash of e.g. Java's List is computed). The problem is: how can you split a container's content into its elements? Perhaps there would always be a blank page as a separator between the items, but maybe not. Also, you can't know a priori whether it is a collection of different stories or a collection of chapters belonging to the same story. I think it would be better to somehow process the TOC.
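
As a rough sketch of the "hash of all items" idea, the following combines per-item hashes the same way the java.util.List.hashCode() contract specifies. The class name ContainerHash and the use of String.hashCode() as the item hash are assumptions for illustration; a real book identifier would need a stronger hash, and the items would first have to be split out of the book, which is exactly the open problem described above.

```java
import java.util.List;

// Hypothetical helper, for illustration only.
final class ContainerHash {
    // Combine item hashes in order, following the List.hashCode() contract:
    // h = 31 * h + (item == null ? 0 : item.hashCode()), starting from 1.
    static int orderedHash(List<String> items) {
        int h = 1;
        for (String item : items) {
            h = 31 * h + (item == null ? 0 : item.hashCode());
        }
        return h;
    }
}
```

Because the combination is order-sensitive, two collections holding the same stories in a different order get different hashes, and adding or removing a single item changes the result as well.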

There could also be "foreign content" in a book, like quotes or proverbs. So picking a single sentence might lead to a different book being identified.

Last edited by Backi; 03-13-2012 at 01:21 PM.