Go Back   MobileRead Forums > E-Book Software > Calibre > Plugins

Old 03-16-2018, 04:16 PM   #331
theducks
Well trained by Cats
Posts: 29,792
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by aquiaolado View Post
Hi.
Can someone explain to me why the plugin does not retrieve any ISBN?
Thanks.

This is the result

Starting job: Extract ISBN for 1 books
===================================================
Title: (Lecture Notes in Social Networks) James A. Dator, John A. Sweeney, Aubrey M. Yee (auth.)-Mutative Media Communication Technologies and Power Relations in the Past, Present, and Futures-Springer Inte
Format: PDF
Path: C:\Users\Paulo Martins\Documents\Biblioteca do Calibre\Desconhecido\(Lecture Notes in Social Networks) (2354)\(Lecture Notes in Social Networ - Desconhecido.pdf
---------------------------------------------------
Failed to extract ISBN
===================================================
Scan complete, with 1 failures
Can you SEE an ISBN in the first few pages? This plugin looks for commonly found ISBN-like patterns.
Old 03-16-2018, 04:55 PM   #332
BetterRed
null operator (he/him)
Posts: 20,565
Karma: 26954694
Join Date: Mar 2012
Location: Sydney Australia
Device: none
From the first post in this thread:

Quote:
This plugin can be used to try to find the ISBN for a book using the text within a book format [file].
That is, from within the PDF, EPUB etc.

BR
Old 03-16-2018, 05:20 PM   #333
aquiaolado
Member
Posts: 14
Karma: 10
Join Date: Mar 2018
Device: smartphone
Hi again.
Thanks for your answers.
It is not just one PDF file; it happens with about 1000 files I have. I can see the ISBN in the first 10 pages.
Old 03-16-2018, 05:24 PM   #334
aquiaolado
Member
format of isbn

The format is, for example: ISBN-13: 978-1-84520-132-6
Old 03-16-2018, 06:38 PM   #335
BetterRed
null operator (he/him)
Quote:
Originally Posted by aquiaolado View Post
Hi again.
Thanks for your answers.
It is not just one PDF file; it happens with about 1000 files I have. I can see the ISBN in the first 10 pages.
Can you see the ISBN as a string of copyable characters, or as characters within an image? AFAIK the plugin doesn't do OCR on PDFs created from images.

BR
Old 03-16-2018, 06:41 PM   #336
aquiaolado
Member
Yes. It is a normal, copyable PDF.
Old 03-16-2018, 07:06 PM   #337
BetterRed
null operator (he/him)
Try converting one of the PDFs to TXT and run the plugin against the TXT version. It is probably best to isolate the TXT format in a different book.

If the plugin can find the ISBN in the TXT version, then there must be something in the PDF that is effectively hiding it. Are the PDFs 'protected' in any way?

BR
Old 03-16-2018, 07:59 PM   #338
theducks
Well trained by Cats
I've never seen an ISBN-13 written with dashes quite like that.
That is mixing the old language-publisher-book number-check digit representation from the print-only days with the 978 barcode series of ISBNs.
Old 03-18-2018, 12:25 PM   #339
Alvgon
Junior Member
Posts: 1
Karma: 10
Join Date: Mar 2018
Device: none
I currently have a PDF with this "copyable" text:

ISBN 0 7506 4790 6

It is not detected as an ISBN, in either the 10-digit or the 13-digit format. I suppose this is because the text string does not have the right length for the scan.py/_evaluate_isbn_match function to detect it.
I'm not a Python programmer, though.
Could any knowledgeable fellow comment on possible solutions?
Old 03-20-2018, 04:10 AM   #340
Divingduck
Wizard
Posts: 1,161
Karma: 1404241
Join Date: Nov 2010
Location: Germany
Device: Sony PRS-650
Because it isn't a valid ISBN-10 declaration.

ISBN:0750647906 and ISBN:0-7506-4790-6 are valid; ISBN 0 7506 4790 6 is not.

https://en.wikipedia.org/wiki/Intern...rd_Book_Number
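
For reference, the check-digit arithmetic described on the Wikipedia page linked above can be sketched in Python. This is only an illustration and not the plugin's code: the names `normalize_isbn`, `is_valid_isbn10` and `is_valid_isbn13` are made up for this example. It checks the arithmetic after separators have been stripped, so it says nothing about which written forms the plugin itself accepts.

```python
import re


def normalize_isbn(number_text):
    """Drop every character except digits and X (hypothetical helper).

    Pass only the number itself, without a leading "ISBN" or "ISBN-13:" label.
    """
    return re.sub(r'[^0-9Xx]', '', number_text).upper()


def is_valid_isbn10(isbn):
    """ISBN-10 rule: weights 10 down to 1, total divisible by 11."""
    if not re.fullmatch(r'[0-9]{9}[0-9X]', isbn):
        return False
    total = sum((10 - i) * (10 if ch == 'X' else int(ch))
                for i, ch in enumerate(isbn))
    return total % 11 == 0


def is_valid_isbn13(isbn):
    """ISBN-13 rule: weights alternate 1 and 3, total divisible by 10."""
    if not re.fullmatch(r'[0-9]{13}', isbn):
        return False
    total = sum(int(ch) * (3 if i % 2 else 1) for i, ch in enumerate(isbn))
    return total % 10 == 0


# The two numbers quoted in this thread, with their separators removed.
print(is_valid_isbn10(normalize_isbn('0 7506 4790 6')))
print(is_valid_isbn13(normalize_isbn('978-1-84520-132-6')))
```

Both numbers quoted in this thread pass the arithmetic check once the spaces or hyphens are removed.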
Old 03-20-2018, 07:59 AM   #341
Nicolas F
Groupie
Posts: 161
Karma: 1842
Join Date: Jan 2016
Device: Kobo Glo HD
Quote:
Originally Posted by Alvgon View Post
I currently have a PDF with this "copyable" text:

ISBN 0 7506 4790 6

It is not detected as an ISBN, in either the 10-digit or the 13-digit format. I suppose this is because the text string does not have the right length for the scan.py/_evaluate_isbn_match function to detect it.
I'm not a Python programmer, though.
Could any knowledgeable fellow comment on possible solutions?
Quote:
Originally Posted by Divingduck View Post
Because it isn't a valid ISBN-10 declaration.

ISBN:0750647906 and ISBN:0-7506-4790-6 are valid; ISBN 0 7506 4790 6 is not.

https://en.wikipedia.org/wiki/Intern...rd_Book_Number
It may be an invalid way to write it down, but that's not really the problem here. The regex used by the plugin will recognize it, so the problem is elsewhere.
(Just try adding "ISBN 0 7506 4790 6" anywhere in an EPUB and the plugin has no problem detecting it.)

The plugin probably has difficulty accessing the text of the PDF.
Old 03-20-2018, 10:31 AM   #342
theducks
Well trained by Cats
Quote:
Originally Posted by Nicolas F View Post
It may be an invalid way to write it down, but that's not really the problem here. The regex used by the plugin will recognize it, so the problem is elsewhere.
(Just try adding "ISBN 0 7506 4790 6" anywhere in an EPUB and the plugin has no problem detecting it.)

The plugin probably has difficulty accessing the text of the PDF.
Those 'spaces' may be some other character, one not recognized by the regex \s.

The problem is that the publisher did something unusual.
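
One way to test this idea is to copy the ISBN line out of the PDF and list the code point of every separator in it. A minimal sketch using only the Python standard library follows; `describe_separators` is a hypothetical helper, and the sample string is invented for illustration, since the thread does not say which characters the PDF really contains.

```python
import unicodedata


def describe_separators(text):
    """Return (code point, Unicode name) for each non-alphanumeric character."""
    report = []
    for ch in text:
        if not ch.isalnum():
            report.append(('U+%04X' % ord(ch), unicodedata.name(ch, 'UNKNOWN')))
    return report


# Invented sample: a no-break space after the label and thin spaces in the number.
# Replace it with text pasted from the real PDF.
sample = 'ISBN\u00a00\u20097506\u20094790\u20096'
for code_point, name in describe_separators(sample):
    print(code_point, name)
```

If every separator is reported as U+0020 SPACE, the characters are ordinary spaces and the cause lies somewhere else.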
Old 04-02-2018, 03:17 PM   #343
BeckyEbook
Guru
Posts: 692
Karma: 2180740
Join Date: Jan 2017
Location: Poland
Device: Misc
Quote:
Originally Posted by Alvgon View Post
I currently have a PDF with this "copyable" text:

ISBN 0 7506 4790 6
As mentioned by @theducks, there is no ordinary "space" between the numbers.

In scan.py, change line 15 to:

Code:
RE_ISBN = re.compile(u'\s*([0-9\-\.–*―—\^ \u2000\u2001\u2002\u2003\u2004\u2005\u2006\u2007\u2008\u2009\u200A]{9,18}[0-9xX])', re.UNICODE)
And try again.
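
To see what the wider character class changes, here is a small, self-contained comparison. It is a sketch and not the plugin source. `NARROW` is an assumed stand-in for the original pattern (the posted pattern minus the U+2000 to U+200A range). `WIDE` restates the posted pattern as a raw string with the dash characters written as escapes (assumed here to be U+2013, U+2015 and U+2014). `find_isbn` and the thin-space sample are invented for the example.

```python
import re

# Assumed stand-in for the original pattern: only the ASCII space is allowed
# as a blank inside the number.
NARROW = re.compile(
    r'\s*([0-9\-\.\u2013*\u2015\u2014\^ ]{9,18}[0-9xX])',
    re.UNICODE,
)

# The posted pattern, restated: the class also accepts U+2000 to U+200A.
WIDE = re.compile(
    r'\s*([0-9\-\.\u2013*\u2015\u2014\^ '
    r'\u2000\u2001\u2002\u2003\u2004\u2005\u2006\u2007\u2008\u2009\u200A'
    r']{9,18}[0-9xX])',
    re.UNICODE,
)


def find_isbn(pattern, text):
    """Return the digits of the first match, or None (hypothetical helper)."""
    match = pattern.search(text)
    if match is None:
        return None
    return re.sub(r'[^0-9xX]', '', match.group(1))


# Invented sample that uses U+2009 THIN SPACE instead of ordinary spaces.
thin_sample = 'ISBN\u20090\u20097506\u20094790\u20096'
print(find_isbn(NARROW, thin_sample))
print(find_isbn(WIDE, thin_sample))
```

Note that the no-break space U+00A0 and the narrow no-break space U+202F lie outside U+2000 to U+200A, so a PDF that uses either of those would still need a further addition to the class.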
Old 03-04-2019, 07:18 AM   #344
excaliber
Connoisseur
Posts: 59
Karma: 10
Join Date: Nov 2013
Device: Samsung Galaxy Tab 2 10.1 P5110
@kiwidude: Thanks for the plugin!
I have one issue with it. For every job that finishes, a dialog box like this appears:
Scan complete
Extract ISBN found x new isbn(s). Proceed with updating your library?


If there are only a few jobs, it's not a problem: I can click Yes and all is OK. The problem arises when there are several hundred jobs and I have to click the Yes button every time; then it becomes annoying.
Would it be possible to implement a "Yes to all" and maybe "No to all" or "Cancel"?
Old 10-12-2019, 11:19 AM   #345
JVarga
Junior Member
Posts: 2
Karma: 10
Join Date: Oct 2019
Device: none
Extract ISBN does not work under Windows 10

This is a rather old thread, but I hope somebody is still reading it...
I keep failing to extract the ISBN from PDF files (all PDFs), with this error message:

Traceback (most recent call last):
  File "site-packages\calibre\utils\ipc\simple_worker.py", line 290, in main
  File "calibre_plugins.extract_isbn.pdf", line 86, in get_isbn
UnboundLocalError: local variable 'scanner' referenced before assignment

It is quite an old error; at present, I use Calibre 4.1.0 under Windows 10.

Can anybody help me solve or work around the problem?
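
For readers unfamiliar with this class of error, the sketch below shows how an UnboundLocalError arises in Python in general. It is not the plugin's pdf.py, which is not shown in this thread: `broken_get_isbn`, `safer_get_isbn` and the `tool_available` flag are hypothetical names used only to illustrate a local variable that is assigned on one branch and read on every branch.

```python
def broken_get_isbn(path, tool_available):
    """Hypothetical: 'scanner' is assigned only when the branch runs."""
    if tool_available:
        scanner = 'scanner for %s' % path
    # When the branch above was skipped, reading 'scanner' here raises
    # UnboundLocalError.
    return scanner


def safer_get_isbn(path, tool_available):
    """Hypothetical: give the name a value before any branch."""
    scanner = None
    if tool_available:
        scanner = 'scanner for %s' % path
    return scanner


try:
    broken_get_isbn('book.pdf', tool_available=False)
except UnboundLocalError as err:
    print('UnboundLocalError:', err)

print(safer_get_isbn('book.pdf', tool_available=False))
```

Under that assumption, the traceback would mean that whatever step normally assigns `scanner` inside `get_isbn` did not run or did not finish for these PDFs. The thread does not show which step that is.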