06-16-2012, 02:07 AM | #241 |
calibre/Sigil Developer
Posts: 4,601
Karma: 2092290
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
The biggest use I personally make of it is when importing books where the title and author fields are not set, for example because the book has a random filename. Rather than typing them in manually, you can extract the ISBN and do a metadata download with the option to overwrite title and author.
Metadata downloads with an ISBN make it much more likely that you get the right metadata from the website, since most metadata plugins will look up by ISBN if available and fall back to a title and author search if not. The latter is more error prone due to spelling errors, typos, series info in the title field, etc. And a small minority undoubtedly use it because they are sufficiently fussy to want the ISBN field to contain the value for their specific edition of that book. |
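As a rough illustration of the "ISBN first, title/author as fallback" behaviour described above, here is a minimal sketch. The function name `build_lookup_query` and the dictionary layout are hypothetical and are not taken from calibre or any metadata plugin.

```python
from typing import Dict, Optional, Sequence


def build_lookup_query(
    isbn: Optional[str],
    title: Optional[str],
    authors: Sequence[str] = (),
) -> Dict[str, object]:
    """Hypothetical helper: prefer an ISBN lookup, else fall back to title/author."""
    if isbn:
        # Strip separators so "0-471-73278-8" and "0471732788" query the same way.
        return {"mode": "isbn", "isbn": isbn.replace("-", "").replace(" ", "")}
    # No ISBN available: the less reliable title/author search is all that is left.
    return {"mode": "title_author", "title": title or "", "authors": list(authors)}
```

With an ISBN present the title and author are ignored entirely, which is why a book with a junk filename can still be identified.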
07-28-2012, 07:09 PM | #242 |
Enthusiast
Posts: 26
Karma: 10
Join Date: Oct 2011
Device: galaxy tab
|
group extract
hello kiwidude,
i'm wondering if it's possible to add some sort of group limit to extraction? i tried selecting all books but it takes too long to finish. the group limit would split the queue into groups of 10, 20 or 50 (depending on how powerful the computer is). e.g.
something like that, is that possible? it will prevent calibre from hanging. thanks |
07-31-2012, 05:28 AM | #243 |
calibre/Sigil Developer
Posts: 4,601
Karma: 2092290
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
Beta for next version
@stanmarsh - give this version a whirl. By default the batch size is 100, but you can increase/reduce it in Preferences -> Plugins -> Extract ISBN -> Configure plugin.
Note that there are a couple of side effects if the number of books you have selected is more than your batch size, causing multiple jobs to be run:
Last edited by kiwidude; 08-01-2012 at 05:14 AM. Reason: Remove attachment as officially released |
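For anyone curious what splitting a selection into batches looks like, here is a minimal sketch. It is not the plugin's actual code; `split_into_batches` is a hypothetical name, and only the default of 100 comes from the post above.

```python
from typing import Iterator, List, Sequence


def split_into_batches(book_ids: Sequence[int], batch_size: int = 100) -> Iterator[List[int]]:
    """Yield consecutive groups of at most batch_size book ids (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(book_ids), batch_size):
        # Each slice would become one job; the last one may be smaller.
        yield list(book_ids[start:start + batch_size])
```

Selecting 250 books with the default batch size would give three groups of 100, 100 and 50, i.e. three separate jobs.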
08-01-2012, 05:15 AM | #244 |
calibre/Sigil Developer
Posts: 4,601
Karma: 2092290
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
v1.4.3 Released
Changes in this release:
|
08-05-2012, 10:01 PM | #245 |
Enthusiast
Posts: 26
Karma: 10
Join Date: Oct 2011
Device: galaxy tab
|
hello kiwidude!
thanks for implementing the feature request! will test it out! |
09-26-2012, 07:34 PM | #246 |
Junior Member
Posts: 4
Karma: 10
Join Date: Jan 2012
Device: prs-500, Ipad 2
|
Fantastic...thank you
|
10-05-2012, 05:20 AM | #247 |
Member
Posts: 11
Karma: 10
Join Date: Oct 2012
Device: Sony PRS-T1
|
Extract ISBN is really great at extracting ISBNs from the book's text. But this made it stumble.
From "The Definitive Guide to How Computers Do Math: Featuring the Virtual Diy Calculator" page 2:
Code:
For general information on our other products and services please contact our Customer Care Department within the U.S. at 877-762-2974, outside the U.S. at 317-572-3993 or fax 317-572-4002. Wiley also publishes its books in a variety of electronic formats. Some content that appears in print, however, may not be available in electronic format. Library of Congress Cataloging-in-Publication Data is available. ISBN-13 978-0471-73278-5 ISBN-10 0-471-73278-8
Code:
Invalid ISBN match: 877-762-2974
Valid ISBN10: 3175723993
Invalid ISBN match: 317-572-4002
Invalid ISBN match: -13 978-0471-73278
Invalid ISBN match: -10 0-471-73278-8 |
10-05-2012, 09:51 AM | #248 | |
Well trained by Cats
Posts: 29,689
Karma: 54369090
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
Once found (a 10 character ISBN 10), the check digit should validate. The NANP phone number should fail in nearly 100% of the cases; the FAX number is one of those edge cases. |
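The check digit validation mentioned above is the standard ISBN-10 MOD 11 test. A minimal sketch follows; the function name `is_valid_isbn10` is just for illustration and is not the plugin's code.

```python
def is_valid_isbn10(candidate: str) -> bool:
    """Return True if candidate passes the ISBN-10 MOD 11 check (separators ignored)."""
    chars = [c for c in candidate.upper() if c in "0123456789X"]
    if len(chars) != 10:
        return False
    if "X" in chars[:9]:
        # X (value 10) is only allowed as the final check character.
        return False
    total = 0
    for position, char in enumerate(chars):
        value = 10 if char == "X" else int(char)
        # Weights run 10, 9, ..., 1 from left to right.
        total += (10 - position) * value
    return total % 11 == 0


for number in ("0-471-73278-8", "877-762-2974", "317-572-3993", "317-572-4002"):
    print(number, is_valid_isbn10(number))
```

Run against the numbers from the page quoted earlier, the real ISBN-10 passes, 877-762-2974 and 317-572-4002 fail, and 317-572-3993 passes, which matches the "Valid ISBN10: 3175723993" line in the log. Any ten random digits have roughly a 1 in 11 chance of passing, so the check digit alone cannot rule out every phone number.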
|
10-05-2012, 05:45 PM | #249 |
Member
Posts: 11
Karma: 10
Join Date: Oct 2012
Device: Sony PRS-T1
|
Well, yes and no. Had the publisher decided to use spaces instead of dashes, your suggestion would still find the number 13 978 0471 73278 5 which wouldn't be valid without parsing all substrings of 13 digits length.
|
10-05-2012, 06:25 PM | #250 | |
Well trained by Cats
Posts: 29,689
Karma: 54369090
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
ISBN and ISBN13 are more normal (ISBN 10 is redundant. ISBN is 10 chars) |
|
10-06-2012, 01:43 PM | #251 |
Member
Posts: 11
Karma: 10
Join Date: Oct 2012
Device: Sony PRS-T1
|
The blank or dash before the number isn't relevant. The regexp will (should) start matching at the first digit.
You are right in saying it is redundant. That doesn't mean it won't be used by publishers. |
10-06-2012, 02:29 PM | #252 | |
Well trained by Cats
Posts: 29,689
Karma: 54369090
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
The pattern should attempt to match starting at the first digit. \d+\-\d+\-\d+\-(\d|X) is one pattern; the first digit pair followed by a space should not be included in the match because it violates this pattern (dash separator). Now, if they had used spaces everywhere, you are correct that the pattern would have started with the 10: because we don't know where the line ends, we can't tell that there were more digits than the pattern could capture if it had not started with the wrong number group. Because of all the various ways of printing the ISBN, lots of post-capture validation needs to be done (the FAX number managed to MOD11 validate, a fairly rare case). |
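To see what that dash pattern picks up, here is a small sketch that runs it over the numbers from the page quoted earlier in the thread. The names `ISBN_DASHED`, `SAMPLE` and `find_dashed_candidates` are made up for the example, and the group is written as non-capturing so that `findall` returns whole matches.

```python
import re

# The pattern from the post, with a non-capturing group for the last character.
ISBN_DASHED = re.compile(r"\d+-\d+-\d+-(?:\d|X)", re.ASCII)

SAMPLE = (
    "within the U.S. at 877-762-2974, outside the U.S. at 317-572-3993 "
    "or fax 317-572-4002. ISBN-13 978-0471-73278-5 ISBN-10 0-471-73278-8"
)


def find_dashed_candidates(text: str) -> list:
    """Return every substring that has the digits-dash-digits shape of the pattern."""
    return ISBN_DASHED.findall(text)


print(find_dashed_candidates(SAMPLE))
# ['978-0471-73278-5', '0-471-73278-8']
```

The phone and fax numbers have only two dashes, so they never match, and the leading "13 " and "10 " are skipped because a space is not a dash. The candidates would still need check digit validation afterwards.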
|
10-07-2012, 07:07 AM | #253 | |
Member
Posts: 11
Karma: 10
Join Date: Oct 2012
Device: Sony PRS-T1
|
Quote:
The pattern you suggested will only match if the ISBN contains exactly three dashes. I'd implement something along the lines of \d(\d| |-)+(\d|X) and then validate all substrings consisting of 10 and 13 digits. But I fail to see why we are discussing the best way to implement this. I could see a point in a discussion with the maintainer of the plugin. But that would be kiwidude. It's his decision how he writes his plugin. I just wanted to point out a case in which the current implementation fails. |
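A minimal sketch of that looser approach, under the assumption that every run of digits, spaces and dashes is captured first and then every 13 and 10 character window inside it is validated. All names here are hypothetical and this is not how the plugin is known to work.

```python
import re
from typing import Set

# Loose candidate pattern from the post, written with non-capturing groups.
CANDIDATE = re.compile(r"\d(?:\d| |-)+(?:\d|X)", re.ASCII)


def is_valid_isbn10(s: str) -> bool:
    if len(s) != 10 or "X" in s[:9]:
        return False
    total = sum((10 - i) * (10 if c == "X" else int(c)) for i, c in enumerate(s))
    return total % 11 == 0


def is_valid_isbn13(s: str) -> bool:
    if len(s) != 13 or "X" in s:
        return False
    # Weights alternate 1, 3, 1, 3, ... and the sum must be a multiple of 10.
    total = sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(s))
    return total % 10 == 0


def find_valid_isbns(text: str) -> Set[str]:
    """Validate every 13 and 10 character window inside each loose candidate."""
    found: Set[str] = set()
    for match in CANDIDATE.finditer(text):
        chars = re.sub(r"[ -]", "", match.group(0))  # drop the separators
        for size, check in ((13, is_valid_isbn13), (10, is_valid_isbn10)):
            for start in range(len(chars) - size + 1):
                window = chars[start:start + size]
                if check(window):
                    found.add(window)
    return found
```

With this, "ISBN 13 978 0471 73278 5" yields 9780471732785 even though the candidate starts at the stray "13". The trade-off is that sliding windows test many more digit strings, so accidental check digit passes become more likely and some extra filtering would probably be needed.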
|
10-09-2012, 11:29 AM | #254 |
Connoisseur
Posts: 87
Karma: 1234
Join Date: Sep 2012
Device: Onyx Boox M92
|
I have noticed that during the execution of the Extract ISBN plugin certain attempts take much more time than others (the execution seems to hang at a particular % and the CPU time goes up), thus slowing down the whole search process.
In the end, some of the processed files invariably fail to return any ISBN and one has to extract them manually anyway. Therefore I wonder if the developer could add an adjustable timeout in the GUI to limit the time wasted on a single search that is highly likely to fail. |
10-10-2012, 05:09 AM | #255 |
calibre/Sigil Developer
Posts: 4,601
Karma: 2092290
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
@RotAnal - the majority of the "slowdown" comes from converting the book to a format the ISBN can be extracted from. If your book is an EPUB (or indeed a PDF) then the performance will be as good as it is going to get; any other format must be converted to EPUB behind the scenes, which is where the lag comes from.
The actual time taken searching for ISBNs is minuscule by comparison, and it is already optimised to only search a small proportion of pages at the front and back of the book rather than the whole thing. |
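The "front and back pages only" idea can be pictured with a short sketch like the one below. The function name `pages_to_scan` and the counts of 10 front pages and 5 back pages are assumptions made for illustration; the post does not say how many pages the plugin actually scans.

```python
from typing import List, Sequence


def pages_to_scan(pages: Sequence[str], front: int = 10, back: int = 5) -> List[str]:
    """Pick the first `front` and last `back` pages (hypothetical counts)."""
    if front < 0 or back < 0:
        raise ValueError("front and back must be non-negative")
    if len(pages) <= front + back:
        # Short book: just scan everything once.
        return list(pages)
    head = list(pages[:front])
    tail = list(pages[len(pages) - back:])  # empty when back == 0
    return head + tail
```

For a 400 page book this would search 15 pages instead of 400, which fits with the search itself being cheap next to the format conversion.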