Old 04-14-2011, 09:33 PM   #76
drMerry
Addict
Posts: 293
Karma: 21022
Join Date: Mar 2011
Location: NL
Device: Sony PRS-650
I did not find any official file with an ASIN in it in my library (some handmade ebooks had one, however).

About the same number repeated 10 times: I did not know that either, but you can try some at:
http://www.isbn-check.com/
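
For reference, here is a minimal sketch of the ISBN-10 check digit rule, assuming the "same number 10 times" remark refers to strings such as 1111111111. The function name is hypothetical and this is not calibre's own check_isbn code. The weights 10 down to 1 add up to 55, a multiple of 11, so any single digit repeated ten times passes the checksum.

```python
def is_valid_isbn10(text):
    """Return True if text is an ISBN-10 with a correct check digit."""
    chars = text.replace("-", "").replace(" ", "").upper()
    if len(chars) != 10:
        return False
    total = 0
    for position, char in enumerate(chars):
        if char == "X" and position == 9:
            value = 10  # 'X' is only allowed as the final check digit
        elif char in "0123456789":
            value = int(char)
        else:
            return False
        # Weights run from 10 (first character) down to 1 (check digit).
        total += (10 - position) * value
    return total % 11 == 0


# Every digit repeated ten times passes, because 55 * d is divisible by 11.
repeated = [str(d) * 10 for d in range(10)]
all_pass = all(is_valid_isbn10(s) for s in repeated)
```

This is why a checksum alone cannot rule out such strings; they need a separate filter.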

The background function seems nice (is it in build 8840?). I myself would prefer a way to modify the order in the settings. Sometimes you just want to work in a different way than usual.

Reverse scan for last pages would be great.
If this slows down the process, you could add it as an option in your settings (since it seems you will have to rebuild it all anyway, this could be a good moment to add such an option).
Old 04-14-2011, 10:07 PM   #77
ldolse
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
I doubt ASIN would be very common. I just checked a few recent Amazon mobis and it's not there, unless Amazon sticks it in some metadata location Calibre doesn't check. I highly doubt it would be in another bookseller's edition...
Old 04-14-2011, 10:27 PM   #78
kiwidude
calibre/Sigil Developer
Posts: 4,230
Karma: 1345754
Join Date: Oct 2010
Location: London, UK
Device: Kindle Paperwhite 3G, iPad 3, iPad Air
Thx for the info guys, I will trash the ASIN idea then and leave that up to the metadata download plugins in 0.8.

@drMerry - the background metadata download is in the latest source but I would assume it won't be turned on until 0.8 is released. Thx for the ISBN check site. As for the reverse checking, I don't foresee any noticeable overhead for that at all, so I will just keep it simple and make it the default behaviour.
Old 04-15-2011, 02:28 PM   #79
drMerry
Addict
Another False-Positive ISBN-problem found

At the moment I get a lot more ISBN numbers than back when there was the ISBN-text test.
I have some new false positives though, but there is a solution for them.
The problem is in the 13-digit ISBN.
A 13-digit ISBN needs to start with 978 or 979.
I got some numbers starting with other, random digits. The checksum is correct though.

If there is a check for a 978 or 979 start on 13-digit ISBN numbers, this problem is solved.
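
To illustrate the false positive, here is a minimal sketch of the ISBN-13 check digit rule combined with a prefix test. The function names are hypothetical and this is not the plugin's code; the checksum itself is the standard EAN-13 rule (alternating weights 1 and 3, total divisible by 10).

```python
def isbn13_checksum_ok(digits):
    """Return True if a 13-digit string has a correct EAN-13 check digit."""
    if len(digits) != 13 or not all(c in "0123456789" for c in digits):
        return False
    # Weights alternate 1, 3, 1, 3, ... starting from the first digit.
    total = sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(digits))
    return total % 10 == 0


def looks_like_isbn13(digits, prefixes=("978", "979")):
    """Checksum test plus the prefix test suggested in this post."""
    return isbn13_checksum_ok(digits) and digits.startswith(prefixes)


# A generic 13-digit number with a correct checksum but no book prefix.
false_positive = "1234567890128"
checksum_only = isbn13_checksum_ok(false_positive)
with_prefix = looks_like_isbn13(false_positive)
```

Here `checksum_only` is True while `with_prefix` is False, which is the kind of number the prefix check would filter out.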
Old 04-15-2011, 02:36 PM   #80
kiwidude
calibre/Sigil Developer
Quote:
Originally Posted by drMerry View Post
Another False-Positive ISBN-problem found...

If there is a check on 978 or 979 start for 13-digit ISBN-numbers, this problem is solved
Ahhh, of course - I had forgotten that permutation falling out of the regex. In fact it makes me wonder if the regex is actually a bit nonsensical in its current form. I think all this "(9[\-\. ]*7[\-\. ]*[89])" should get ripped out and just replaced with a simple check once we hit a 13-digit number.
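
A rough sketch of that idea: match any run of 10 to 13 digits with optional separators, strip the separators, and only then apply a plain prefix test to the 13-digit results. The pattern and function name below are hypothetical, not the plugin's actual regex, and check digit validation is assumed to happen in a later step.

```python
import re

# Hypothetical pattern: one digit, then 9 to 12 more digits (or X), each
# optionally preceded by '-', '.' or ' ' separators. The lookarounds stop
# the match from starting or ending inside a longer run of digits.
CANDIDATE = re.compile(r"(?<!\d)\d(?:[\-\. ]*[\dXx]){9,12}(?![\dXx])")


def find_candidates(text, prefixes=("978", "979")):
    """Return separator-free ISBN candidates found in text, in order."""
    results = []
    for match in CANDIDATE.finditer(text):
        raw = re.sub(r"[\-\. ]", "", match.group()).upper()
        if len(raw) == 13:
            # The simple check replaces the prefix logic in the regex.
            if raw.isdigit() and raw.startswith(prefixes):
                results.append(raw)
        elif len(raw) == 10:
            results.append(raw)
    return results


sample = "ISBN 978-0-306-40615-7 and EAN 1234567890128 and 0-306-40615-2"
found = find_candidates(sample)
```

With this sample, the 13-digit number without a 978/979 prefix is dropped and the other two are kept.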
Old 04-15-2011, 03:44 PM   #81
kovidgoyal
creator of calibre
Posts: 26,463
Karma: 5383257
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
IIRC, there's no requirement that 13 digit ISBN start with any fixed set of numbers. The only reason that most (all?) current ISBN 13s do so is because those two spaces haven't been exhausted as yet.
Old 04-15-2011, 04:11 PM   #82
theducks
Grand Sorcerer
Posts: 15,301
Karma: 6022735
Join Date: Aug 2009
Location: (The original) Silicon Valley, USA
Device: Galaxy Tab 2, Astak Pocket Pro, K4NT
Quote:
Originally Posted by kovidgoyal View Post
IIRC, there's no requirement that 13 digit ISBN start with any fixed set of numbers. The only reason that most (all?) current ISBN 13s do so is because those two spaces haven't been exhausted as yet.
The EAN council controls the numbering (bar code)

977 (ISSN) Periodicals
978 and 979 (Bookland, AKA ISBN-13)

98 and 99 are already assigned (to coupons)

I did not see the lower 97[0-6] on the list

It may have been short-sighted of them, when this was started about 20 years ago, not to reserve a larger block.
Old 04-15-2011, 04:15 PM   #83
kiwidude
calibre/Sigil Developer
Quote:
Originally Posted by kovidgoyal View Post
IIRC, there's no requirement that 13 digit ISBN start with any fixed set of numbers. The only reason that most (all?) current ISBN 13s do so is because those two spaces haven't been exhausted as yet.
I've just looked this up here:
http://www.isbn-international.org/faqs/view/5

According to them:
Quote:
Prefix element – currently this can only be either 978 or 979 (it is always 3 digits).
I guess the issue is the definition of "currently"

I guess what I could do is just keep any ISBN-13 number it finds and keep scanning until it finds one with 978/979. If the latter is present, that will get returned.

Or I could throw it in as a configuration option as to whether to accept things other than 978/979. I would rather just be relying on the check_isbn logic in Calibre though for consistency.
Old 04-15-2011, 04:52 PM   #84
kovidgoyal
creator of calibre
I would suggest preferentially using a 13 digit match that starts with 978/9. calibre's check_isbn13 only checks that the check digit is correct. It makes no assumptions about the first 3 digits.
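
A minimal sketch of that preference, assuming the candidates have already passed a check digit test and are listed in the order they were found. The function name and the fallback behaviour are illustrative assumptions, not calibre or plugin code.

```python
def pick_isbn(candidates, preferred_prefixes=("978", "979")):
    """Pick one ISBN, preferring a 13-digit match with a known prefix."""
    fallback = None
    for isbn in candidates:
        if len(isbn) == 13 and isbn.startswith(preferred_prefixes):
            return isbn  # preferred match found, stop scanning
        if fallback is None:
            fallback = isbn  # remember the first other valid match
    return fallback


best = pick_isbn(["1234567890128", "9780306406157"])
only_other = pick_isbn(["1234567890128"])
```

A number with another prefix is still returned when nothing better turns up, so the result stays consistent with a checksum-only validation.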
Old 04-16-2011, 11:22 AM   #85
drMerry
Addict
Quote:
Originally Posted by kiwidude View Post
I've just looked this up here:
http://www.isbn-international.org/faqs/view/5

According to them:
....
I guess the issue is the definition of "currently"
The problem is indeed "currently".
The (c) at the end of the pages states 2009.
However, I cannot find any more recent document, so it seems these are the only ones available at the moment.

But it would be a good idea to add a settings option for this.
By default you could include 978 and 979. If more prefixes get assigned and the plugin (or calibre) is no longer being developed, you could add them manually.
Old 04-16-2011, 04:36 PM   #86
telemetrics
Junior Member
Posts: 8
Karma: 10
Join Date: Apr 2011
Device: IPad
Extract ISBN - Fantastic Feature. Further Suggestions

I just downloaded Calibre and was just wondering about this feature. Thanks a lot.

Feature 1: OCR
Is it possible to extract the first and last 3/4 pages of an eBook and run them through an open-source (or free) OCR?
http://code.google.com/p/tesseract-ocr/

Feature 2: Autorun "Download metadata and covers" for all files where ISBN was found.

Feature 3: Detect ISBN in File Name.
ISBN numbers are found in file names in some cases. They may not have the string 'ISBN' as a prefix, just the bare ISBN-10 or ISBN-13 number. However, we would need to clean special characters like underscores and square brackets.

Feature 4: ReOrder Suggestion based on Name
In case multiple ISBN numbers are found, we could show the options and let the user select one (in just one click). The optional ISBN numbers can be looked up, and the titles and authors displayed next to them.
However, these should be ordered by the distance from the title of each option to the file name of the ebook.
http://en.wikipedia.org/wiki/Levenshtein_distance
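
As an illustration of Feature 4, here is a small sketch of ordering looked-up titles by Levenshtein distance to the file name. The function names and the (isbn, title) data shape are assumptions made for the example; the metadata lookup itself is not shown.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def rank_by_title(file_name, options):
    """Sort (isbn, title) pairs, closest title to the file name first."""
    target = file_name.lower()
    return sorted(options, key=lambda opt: levenshtein(opt[1].lower(), target))


options = [("0306406152", "Emma"), ("9780306406157", "Dune")]
ranked = rank_by_title("dune", options)
```

The comparison is case-insensitive here; real file names would probably also need separators and extensions stripped before measuring the distance.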
Old 04-16-2011, 05:11 PM   #87
kiwidude
calibre/Sigil Developer
@telemetrics - thx for the suggestions.

Anything related to the filename won't work: the filename is the name Calibre has given it, not whatever it might have had originally. If you have books with an ISBN in the filename then you can use the file pattern at the time you add the book to pick that up.
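
A hedged sketch of such a file pattern, written as a plain Python regular expression with named groups. It assumes names shaped like "Some Title [9780306406157]" and assumes that the add-books pattern accepts a named isbn group; check the Calibre manual for the group names it actually supports.

```python
import re

# Hypothetical pattern: a title, then separators (space, underscore,
# bracket), then a bare 13-digit or 10-character ISBN at the very end.
FILE_PATTERN = re.compile(
    r"^(?P<title>.+?)[\s_\[\(]+(?P<isbn>\d{13}|\d{9}[\dXx])[\]\)_]*$"
)


def split_name(file_name):
    """Return (title, isbn) from a file name without extension, or None."""
    match = FILE_PATTERN.match(file_name)
    if match is None:
        return None
    return match.group("title"), match.group("isbn")


bracketed = split_name("Some Title [9780306406157]")
underscored = split_name("My_Book_0306406152")
```

The pattern only covers a trailing ISBN; other naming schemes would need their own expression.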

I'll have a think about your other points when I get some time before I respond - I've got a lot of other changes for both this plugin and others that I have to get sorted first. Thanks for the suggestions though and perhaps others may have feedback on them.
Old 04-16-2011, 09:13 PM   #88
ldolse
Wizard
Quote:
Originally Posted by telemetrics View Post
I just downloaded Calibre and was just wondering about this feature. Thanks a lot.

Feature 1: OCR
Is it possible to extract the first and last 3/4 pages of an eBook and run them through an open-source (or free) OCR?
http://code.google.com/p/tesseract-ocr/

Feature 2: Autorun "Download metadata and covers" for all files where ISBN was found.

Feature 3: Detect ISBN in File Name.
ISBN numbers are found in file names in some cases. They may not have the string 'ISBN' as a prefix, just the bare ISBN-10 or ISBN-13 number. However, we would need to clean special characters like underscores and square brackets.

Feature 4: ReOrder Suggestion based on Name
In case multiple ISBN numbers are found, we could show the options and let the user select one (in just one click). The optional ISBN numbers can be looked up, and the titles and authors displayed next to them.
However, these should be ordered by the distance from the title of each option to the file name of the ebook.
http://en.wikipedia.org/wiki/Levenshtein_distance
Adding OCR seems like an inordinate amount of work for a very small return, just to discover the ISBN number in a small handful of books. I doubt that C code can be included in a plugin; it would generally require integration with Calibre and Calibre's build process, which also requires the OCR project to be set up for reliable cross-platform compilation. Beyond that, as it currently stands the pdf engine can't be trusted to reliably detect/extract images from an image-based pdf. Not sure if the new pdf engine is any better.

Number 2 can be accomplished by typing ISBN:True in the search box after using the plugin, highlighting everything, and pressing Ctrl-D.

Number 3 can be done while importing the book as Kiwidude noted. There are a number of threads in the library management subforum, if you're not sure how to go about it I suggest searching/asking there.

While number 4 is something that could be done, it seems like a lot of work for, again, little ROI (and the selections would likely include lots of false positives trying to guess if there is a title in the vicinity of the ISBN) - kiwidude maintains the plugin, so tackling something like that is up to him, but personally I'd rather see him investing his time in the dup detection plugin or one of the other projects.

Last edited by ldolse; 04-16-2011 at 09:58 PM.
Old 04-19-2011, 07:17 AM   #89
kiwidude
calibre/Sigil Developer
Ok, so here are my plans for the next version of this plugin:
  • The scan will run as a background job and pop up a dialog when done, just like the metadata download does in Calibre 0.8
  • I am removing all of the interactive choices/options. These were put in to get around the fact that a scan could take a long time and have made the code much more complex than I like. I want just a single "Extract ISBN" option that runs in the background and scans all formats for the book in preferred input order until it finds one.
  • If it finds any new or updated isbns, then it will "mark" those books and issue a search of "marked:new_isbn". So if you want to do a metadata download on just that book subset you can do so. I might make this a config option to turn off if people don't want their library search changed after isbns are found.
  • I will follow drMerry's suggestion of a configuration option for valid ISBN13 prefixes, defaulting to 978/979. I would rather that than get values that at this point in time we know for sure are not valid ISBNs.
  • For scanning PDFs, the final 5 pages will be scanned in reverse order. For all other formats it will have the same behaviour as now of scanning the whole book from front to back. The latter is something that might be optimised in future but it is a lower priority imho.

Feel free to comment with any objections to the above.
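
As a rough sketch of the scan order in the last point of the plan, assuming pages are addressed by 0-based index and that only the tail of a PDF is of interest here. The function name and the tail parameter are hypothetical; any scan of the front pages of a PDF is left out.

```python
def pages_to_scan(page_count, is_pdf, tail=5):
    """Return 0-based page indexes in the order they would be scanned."""
    if is_pdf:
        # Final `tail` pages, last page first.
        first_tail_page = max(0, page_count - tail)
        return list(range(page_count - 1, first_tail_page - 1, -1))
    # Other formats: whole book, front to back.
    return list(range(page_count))


pdf_order = pages_to_scan(12, is_pdf=True)
epub_order = pages_to_scan(3, is_pdf=False)
```

For a 12-page PDF this yields pages 11 down to 7, and a short PDF with fewer than five pages is simply scanned in full from the back.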
Old 04-19-2011, 08:15 AM   #90
theducks
Grand Sorcerer
You might want to include ISSN (977) for those that store Periodicals