Old 02-14-2009, 12:45 PM   #116
Jellby
frumious Bandersnatch
Posts: 7,563
Karma: 20150435
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
Quote:
Originally Posted by nrapallo View Post
A good place to start analyzing is our ebook listing in html or txt format.
After removing the most obvious duplicates from the txt listing (deleting the format, the date, and the "IMP" and "epub" labels, then deleting any identical consecutive lines), I get some 4540 books, less than 50% of the total. That's still an upper bound, as some duplicates remain: titles that differ only in capitalization or punctuation, and different versions uploaded in different threads.
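
For anyone who wants to repeat the count, here is a rough Python sketch of that kind of clean-up. The layout of the txt listing isn't shown here, so the three patterns (a bracketed format tag, an MM-DD-YYYY date, bare "IMP"/"epub" labels) and all function names are assumptions for illustration, not the actual steps I ran; adjust them to the real file.

```python
import re
from itertools import groupby

# Assumed patterns: the real listing layout is not shown in this thread.
FORMAT = re.compile(r"\[[^\]]*\]")            # hypothetical "[LRF]"-style tag
DATE = re.compile(r"\b\d{2}-\d{2}-\d{4}\b")   # hypothetical MM-DD-YYYY date
LABELS = re.compile(r"\b(?:IMP|epub)\b")      # the "IMP" and "epub" labels


def clean(line):
    """Strip format tag, date and labels, then collapse whitespace."""
    for pattern in (FORMAT, DATE, LABELS):
        line = pattern.sub(" ", line)
    return " ".join(line.split())


def dedupe_consecutive(lines):
    """Drop identical consecutive lines (like `uniq`), skipping blanks."""
    cleaned = (clean(line) for line in lines)
    return [key for key, _ in groupby(c for c in cleaned if c)]


def loose_key(title):
    """Comparison key that ignores capitalization and punctuation."""
    return " ".join(re.sub(r"[^\w\s]", "", title).casefold().split())


sample = [
    "Austen, Jane: Emma [LRF] 01-02-2009",
    "Austen, Jane: Emma [MOBI] 01-03-2009",
    "Austen, Jane: Emma epub 01-04-2009",
    "austen, jane - emma [LRF] 02-01-2009",
    "Verne, Jules: Paris IMP 02-02-2009",
]
titles = dedupe_consecutive(sample)
print(len(titles))                           # upper bound on distinct books
print(len({loose_key(t) for t in titles}))   # tighter estimate
```

The last line goes one step further than what I did: comparing on `loose_key` would also catch titles that differ only in capitalization or punctuation, but it cannot tell apart different versions of the same book uploaded in different threads.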