02-18-2019, 04:04 PM | #1 |
Junior Member
Posts: 8
Karma: 53566
Join Date: Mar 2017
Device: Kindle Touch
|
Preserving the hyphens in ISBN number
Is there any way in Calibre to preserve the hyphens in the ISBN in the Ids field?
I am raising this question after noticing the spotty quality of the Publisher field that Download Metadata obtains from Amazon, Google Books and OCLC WorldCat.
The ISBN has a formal structure:
978 (EAN prefix) - <Registration group> (approximates to language) - <Registrant> (Publisher/Imprint) - <Publication element> (Publisher's book/edition Id) - <Check digit>
This means that the ISBN encodes an authoritative identifier for publishers, not subject to the vagaries of data entry by Amazon, Google Books, or library staff. This would allow identification and correction of metadata errors, using Calibre's catalogue function to list publisher and ISBN, ordered by ISBN. However, as the elements are variable length, it is far easier to see the Registrant if the hyphens remain in the data.
I don't want to add a custom column, as this would duplicate data. Currently, I am considering exporting a CSV file and writing an Excel macro to insert hyphens. However, this is an ugly fudge, and I wondered whether there was any way to stop Calibre removing the hyphens (and losing useful information in the process). |
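To illustrate why the hyphens matter for the check described above, here is a minimal Python sketch that splits a hyphenated ISBN-13 into its five elements and sorts a small list by registrant. The names `IsbnParts` and `split_isbn13` and the sample values are illustrative assumptions, not anything calibre provides.

```python
from typing import NamedTuple


class IsbnParts(NamedTuple):
    prefix: str       # EAN prefix, e.g. "978"
    group: str        # registration group
    registrant: str   # publisher / imprint
    publication: str  # publisher's book or edition id
    check: str        # check digit


def split_isbn13(hyphenated: str) -> IsbnParts:
    """Split a hyphenated ISBN-13 into its five elements."""
    parts = hyphenated.strip().split("-")
    if len(parts) != 5:
        raise ValueError(f"expected 5 hyphen-separated elements: {hyphenated!r}")
    return IsbnParts(*parts)


# Sample values for illustration only.
samples = ["978-0-684-84328-5", "978-0-19-852663-6"]

# Sorting by (group, registrant) lists books from the same publisher together.
by_registrant = sorted(samples, key=lambda s: split_isbn13(s)[1:3])
```

Once the hyphens are stripped, the same split is no longer possible without knowing the length of each element.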
02-18-2019, 06:14 PM | #2 |
Well trained by Cats
Posts: 29,800
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
The information is still there; you have only lost the implied parse (publisher and book number). A BPH gets a 1- or 2-digit publisher code, while a vanity press may get only 1 digit of book number.
Note ISBN 13 (EAN) does not have the check digit that ISBN10 does (0-9,X) |
02-18-2019, 06:52 PM | #3 | |
Wizard
Posts: 2,082
Karma: 8796704
Join Date: Jun 2010
Device: Kobo Clara HD,Hisence Sero 7 Pro RIP, Nook STR, jetbook lite
|
Quote:
bernie |
|
02-19-2019, 01:36 AM | #4 |
creator of calibre
Posts: 43,856
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
There is no information lost by removing hyphens. Hyphens are there simply for humans to read; the ISBN number means the same thing with or without hyphens.
|
02-19-2019, 04:31 AM | #5 |
The Grand Mouse 高貴的老鼠
Posts: 71,506
Karma: 306214458
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Voyage
|
|
02-19-2019, 04:34 AM | #6 | |
The Grand Mouse 高貴的老鼠
Posts: 71,506
Karma: 306214458
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Voyage
|
Quote:
Note that some big publishers have multiple prefixes, and some publishers will not be consistent in matching prefix to imprint. |
|
02-19-2019, 04:41 AM | #7 | |
Nameless Being
|
Quote:
since 1 January 2007 they now always consist of 13 digits. ISBNs are calculated using a specific mathematical formula and include a check digit to validate the number. (e.a) Each ISBN consists of 5 elements with each section being separated by spaces or hyphens. Three of the five elements may be of varying length: ... Check digit – this is always the final single digit that mathematically validates the rest of the number. It is calculated using a Modulus 10 system with alternate weights of 1 and 3. |
|
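The check digit rule quoted above (Modulus 10 with alternate weights of 1 and 3) can be sketched in Python as follows. The function names are illustrative, not part of calibre or any ISBN library.

```python
def isbn13_check_digit(first12: str) -> int:
    """Compute the ISBN-13 check digit from the first 12 digits."""
    if len(first12) != 12 or not (first12.isascii() and first12.isdigit()):
        raise ValueError("expected exactly 12 digits")
    # Weights alternate 1, 3, 1, 3, ... starting with the first digit.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first12))
    # The check digit brings the weighted sum up to a multiple of 10.
    return (10 - total % 10) % 10


def is_valid_isbn13(isbn: str) -> bool:
    """Validate an ISBN-13, with or without hyphens or spaces."""
    digits = isbn.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return False
    return isbn13_check_digit(digits[:12]) == int(digits[12])
```

For 978-0-684-84328-5 the weighted sum of the first 12 digits is 135, so the check digit is 5. The result is the same whether or not the hyphens are present.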
02-19-2019, 04:53 AM | #8 |
Well trained by Cats
Posts: 29,800
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
|
02-19-2019, 04:22 PM | #9 | ||
Junior Member
Posts: 8
Karma: 53566
Join Date: Mar 2017
Device: Kindle Touch
|
Hi, theducks and kovidgoyal
Quote:
Quote:
The file defines the codes (of varying length) that the International ISBN agency assigns to each National ISBN agency. It then describes the ranges that each National ISBN agency will use for allocating Registrant codes. Note that these ranges are allocated by National ISBN agencies according to the structure of their national publishing industry. (Example: Israel (3-digit code) allocates 2-digit registrants in the range 00-19, while neighboring Jordan (4-digit code) allocates 2-digit registrants in the range 10-49. Their ranges for 3-digit and 4-digit registrants also differ, while Jordan has a single 1-digit registrant and Israel has a range of 5-digit registrants.)
Yes, the publisher information is contained within the 13 digits of the ISBN, but because the Registration Group and Registrant are variable length, it is impossible to obtain without either (1) hyphens separating each part of the ISBN, or (2) code to apply the International ISBN agency XML file. If you go to https://www.isbn-international.org/r...ile_generation, you can get a copy of the XML file (or a PDF file for humans to read).
BTW, just to be clear, ISBN-13 has a check digit, calculated using a Modulus 10 system with alternate weights of 1 and 3 (see https://www.isbn-international.org/content/what-isbn). It is a different formula from the ISBN-10 one, but still a check digit.
I agree that the information is there, in theory. However, it is not accessible. To offer an analogy, you have a PGP encrypted message, but you do not have the decryption key. The information is present, but unobtainable. |
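As a rough sketch of option (2), the Python below re-inserts hyphens using a table of registrant ranges. The table `SAMPLE_RULES` is a simplified, hand-typed assumption covering only the 978-0 group and may not match the agency's current file; real rules should be loaded from the agency's XML file. The function name `hyphenate` and the rule layout are likewise illustrative.

```python
# ASSUMPTION: simplified sample rules for group 978-0 only, not authoritative.
# Each rule is (low, high, registrant_length), compared against the first
# 7 digits that follow the registration group.
SAMPLE_RULES = {
    "978-0": [
        (0, 1999999, 2),
        (2000000, 6999999, 3),
        (7000000, 8499999, 4),
        (8500000, 8999999, 5),
        (9000000, 9499999, 6),
        (9500000, 9999999, 7),
    ],
}


def hyphenate(isbn13: str, rules=SAMPLE_RULES) -> str:
    """Re-insert hyphens into a 13-digit ISBN using range rules."""
    digits = isbn13.replace("-", "")
    if len(digits) != 13 or not digits.isdigit():
        raise ValueError("expected a 13-digit ISBN")
    for key, ranges in rules.items():
        prefix, group = key.split("-")
        head = prefix + group
        if not digits.startswith(head):
            continue
        rest = digits[len(head):12]  # registrant + publication element
        probe = int(rest[:7].ljust(7, "0"))
        for low, high, length in ranges:
            if low <= probe <= high:
                return "-".join(
                    [prefix, group, rest[:length], rest[length:], digits[12]]
                )
    raise LookupError(f"no rule covers {isbn13!r}")
```

With these sample rules, `hyphenate("9780684843285")` returns `978-0-684-84328-5`. An ISBN from a group that is missing from the table raises `LookupError` rather than guessing a split.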
||
02-19-2019, 10:25 PM | #10 |
creator of calibre
Posts: 43,856
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Well, I'm afraid I'm not interested in preserving hyphens in that field. It is a bit too much work, and I doubt many of the metadata sources preserve the hyphens either.
You could, however, write a simple script to process the metadata using the calibredb command line tool. It could read the XML file and thereby extract the needed information, and either directly correct the publisher or stick the hyphenated ISBN value into a custom column. If you wanted to get really ambitious, you could write a calibre plugin to do this as well. |
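A minimal sketch of the reading half of such a script, in Python, might look like this. It relies on `calibredb list` with the `--for-machine` (JSON output) and `--fields` options. The JSON shape assumed here (a list of objects with `id` and `isbn` keys) should be checked against your calibre version, and the function names are illustrative.

```python
import json
import subprocess


def parse_isbns(json_text: str) -> dict:
    """Map book id -> ISBN from calibredb's JSON output (assumed shape)."""
    rows = json.loads(json_text)
    return {row["id"]: row["isbn"] for row in rows if row.get("isbn")}


def list_isbns(library_path: str) -> dict:
    """Run calibredb against a library and return {book_id: isbn}."""
    result = subprocess.run(
        [
            "calibredb", "list",
            "--for-machine",           # JSON instead of a text table
            "--fields", "isbn",
            "--with-library", library_path,
        ],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_isbns(result.stdout)
```

The resulting mapping could then be hyphenated from the range file and written back to a custom column, for example with `calibredb set_custom`.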
02-20-2019, 06:05 AM | #11 | ||
Junior Member
Posts: 8
Karma: 53566
Join Date: Mar 2017
Device: Kindle Touch
|
Quote:
However, I know one excellent data source that does preserve the hyphens: the Internet Speculative Fiction Database (www.isfdb.org). Although primarily limited in coverage to science fiction and fantasy, in practice it extends to many thrillers, some detective fiction, and some historical fiction. It preserves the formatted ISBN, links variant titles (where the same novel has been printed with different titles), is *much* better at documenting series and sub-series data than any other source I have found, and is excellent for pen names and publisher data. I use this to verify metadata downloads. Quote:
|
||
02-20-2019, 12:12 PM | #12 |
Deviser
Posts: 2,265
Karma: 2090983
Join Date: Aug 2013
Location: Texas
Device: none
|
Since Calibre will automatically strip out the hyphens given any Edit Metadata chance to do so, you could temporarily add the hyphens, then immediately copy part of the ISBN into a custom column using Bulk Metadata Edit Search & Replace.
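This SQL will work in any SQLite tool, such as 'SQLite Expert-Personal' (which is free), to add the hyphens before you use BME S&R. Note that it assumes a fixed 3-1-3-5-1 split, so the result is only correct for ISBNs with a 1-digit registration group and a 3-digit registrant.
Code:
UPDATE identifiers
SET val = (SUBSTR(val,1,3)||'-'||SUBSTR(val,4,1)||'-'||SUBSTR(val,5,3)||'-'||SUBSTR(val,8,5)||'-'||SUBSTR(val,13,1))
WHERE type = 'isbn'
  AND val NOT LIKE '%-%'
  AND LENGTH(val) = 13   /* skip ISBN-10s and malformed values */
  /* AND book = 25376 */ /* optional: limit the update to a single book */
;
/* example output: 978-0-684-84328-5 */
/* These are comments */
DaltonST |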
02-20-2019, 02:19 PM | #13 |
Bibliophagist
Posts: 35,393
Karma: 145435140
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Forma, Clara HD, Lenovo M8 FHD, Paperwhite 4, Tolino epos
|
One issue is that in content.opf, the hyphens from the ISBN seem to be removed in almost all of the ebooks that included that information. I found that I had to check inside the book to see if the ISBN is there, most often on the copyright page. Quite often more than 1 ISBN is present there since for some reason, publishers seem to like both the print and ebook ISBNs to be present.
|
02-20-2019, 02:29 PM | #14 |
null operator (he/him)
Posts: 20,568
Karma: 26954694
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
@Cynosarges - if you have the hyphenated ISBN values in a spreadsheet, you could wrangle them into a calibre custom column via the Import List plugin. The plugin can match rows in a CSV table to books in a library.
BR |
|