10-01-2020, 03:58 AM | #391 |
Member
Posts: 12
Karma: 10
Join Date: Apr 2016
Device: Android smartphone
|
The plugin works with EPUBs and with PDFs converted to EPUB. But I have very few EPUBs, and conversion is not the best option. I have the standard Defender. Disabling it did not affect the error.
This is the first time I have used the plugin and calibre.
Starting job: Extract ISBN for 1 books
===================================================
Title: Hacker and Moore's Essentials of Obstetrics and Gynecology
Format: EPUB
Path: \\192.168.1.3\NASpace\Recipe\Library\Neville F. Hacker\Hacker and Moore's Essentials of Ob (123)\Hacker and Moore's Essentials o - Neville F. Hacker.epub
---------------------------------------------------
Scanning first 10, then last 5, then remaining 68 files
Invalid ISBN match: 19103-2899
Valid ISBN13: 9781416059400
Valid ISBN13: 9780808924166
Invalid ISBN match: 215 239 3804
Valid ISBN10: 1865843830
Invalid ISBN match: 1865 853333
Valid ISBN13: 9781416059400
Invalid ISBN match: 22 2008025860
Invalid ISBN match: 9 8 7 6 5 4 3 2
Scan time: 3.90 secs
The isbn was found in 3.90 secs
Identical ISBN extracted of: 9781416059400
===================================================
Scan complete, with 0 failures
|
10-01-2020, 04:27 AM | #392 |
Grand Sorcerer
Posts: 24,907
Karma: 47303748
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
|
OK, that confirms that the plugin works and that it isn't a more general problem. The code that does the text extraction from the PDF is fairly old, and there may be a better way to do it now. Though I cannot see a reason why it works for me but not for you.
|
10-01-2020, 04:43 AM | #393 |
Member
Posts: 12
Karma: 10
Join Date: Apr 2016
Device: Android smartphone
|
How else can the ISBN be extracted from a PDF?
|
10-01-2020, 07:58 AM | #394 |
Grand Sorcerer
Posts: 24,907
Karma: 47303748
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
|
Not that I know of. I'm not sure how many people actually use this plugin. Personally, I download metadata for my books, and that can get the ISBN if one is available. Otherwise, I don't worry about it.
If the metadata download doesn't work, the brute force method would be to convert the PDFs to EPUB, run the plugin and then delete the EPUBs. That should work, but will take time. I do plan to have a look at it, but I don't know when. As it is a plugin I don't use, it isn't a high priority for me. If there is someone else who wants to look... |
10-01-2020, 02:24 PM | #395 |
Member
Posts: 12
Karma: 10
Join Date: Apr 2016
Device: Android smartphone
|
Perhaps this will help.
Title: Management Of High-Risk Pregnancy An Evidence-Based Approach Queenan
Format: PDF
Path: \\192.168.1.3\NASpace\Recipe\Library\2007\Management Of High-Risk Pregnancy A (544)\Management Of High-Risk Pregnan - 2007.pdf
---------------------------------------------------
Traceback (most recent call last):
  File "site-packages\calibre\utils\ipc\simple_worker.py", line 308, in main
  File "calibre_plugins.extract_isbn.pdf", line 90, in get_isbn
UnboundLocalError: local variable 'scanner' referenced before assignment
Failed to extract ISBN
|
10-02-2020, 11:15 AM | #396 | |
Resident Curmudgeon
Posts: 73,887
Karma: 128597114
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
Quote:
|
|
10-02-2020, 09:08 PM | #397 |
Grand Sorcerer
Posts: 24,907
Karma: 47303748
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
|
|
10-03-2020, 05:04 AM | #398 |
Member
Posts: 12
Karma: 10
Join Date: Apr 2016
Device: Android smartphone
|
Title: Epilepsy and Pregnancy - What Every Woman with Epilepsy Should Know Chillemi
Format: PDF
Path: C:\Users\magio\Documents\Library\2006\Epilepsy and Pregnancy - What Every (1)\Epilepsy and Pregnancy - What E - 2006.pdf
---------------------------------------------------
Exception when scanning for ISBN: not all arguments converted during string formatting
Failed to extract ISBN
===================================================
Scan complete, with 1 failures
|
10-03-2020, 05:16 AM | #399 | |
Grand Sorcerer
Posts: 24,907
Karma: 47303748
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
|
Quote:
|
|
10-03-2020, 05:46 AM | #400 | |
Grand Sorcerer
Posts: 11,733
Karma: 6690881
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
|
Quote:
|
|
10-03-2020, 06:31 AM | #401 |
Grand Sorcerer
Posts: 24,907
Karma: 47303748
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
|
I haven't been able to get it to fail on any version. That makes it hard to debug. I did follow the code down and I can see there is a difference in how pdftohtml is used compared to how it is used when doing a conversion. But, that might just be coding style or because the desired results are different. I haven't had a chance to study it. Or a lot of desire to.
|
10-09-2020, 10:44 PM | #402 |
Member
Posts: 15
Karma: 10
Join Date: Nov 2018
Location: Thailand
Device: jlalik14@gmail.com
|
Extracting ISBN (eBook)
Hi, I would be grateful for advice regarding Extract ISBN. The problem: the plugin extracts the ISBN from the "ISBN xxxxxxx" entry in the book. Using this ISBN, the metadata downloader cannot find the book. If I manually enter the "ISBN xxxxxxxxx (ebook)" or eISBN from the book, the metadata download finds it with no problem. I have many books that list two ISBNs, ISBN and ISBN (ebook). Is there any way to modify the plugin to read the ISBN (ebook) number instead of just the ISBN number? Thanks in advance.
|
11-20-2020, 09:11 AM | #403 |
Connoisseur
Posts: 51
Karma: 666
Join Date: May 2020
Location: Germany
Device: android smartphone + tablet
|
PDF and "Exception when scanning for ISBN: not all arguments converted..."
Same error message here, so I played around with the plugin code.
As I found out, this message is just the tip of the iceberg. To avoid the formatting error message, the line
    log.error('Exception when scanning for ISBN:', e)
in extract_threaded() in jobs.py should be changed to
    log.error('Exception when scanning for ISBN: {}: {}'.format(type(e).__name__, e))
or
    log.error(e.__traceback__)
or similar.
Next I came across:
Starting job: Extract ISBN for 1 books
===================================================
Path: E:\Libraries\Literature\Theodor Fontane\Unterm Birnbaum (264)\Unterm Birnbaum - Theodor Fontane.pdf
---------------------------------------------------
WorkerError: Traceback (most recent call last):
  File "calibre\utils\ipc\simple_worker.py", line 300, in main
  File "calibre_plugins.extract_isbn.pdf", line 92, in get_isbn
UnboundLocalError: local variable 'scanner' referenced before assignment
Failed to run pdfinfo/pdftohtml
Error in jobs.py: Exception when scanning for ISBN:
Failed to extract ISBN
===================================================
Scan complete, with 1 failures
So in pdf.py, immediately after
    def get_isbn(output_dir, pdf_name, log=None):
I added the line
    scanner = BookScanner(log)
The next step was:
Starting job: Extract ISBN for 1 books
===================================================
Title: Unterm Birnbaum.
Format: PDF
Path: E:\Bibliotheken\Literatur\Theodor Fontane\Unterm Birnbaum (264)\Unterm Birnbaum - Theodor Fontane.pdf
---------------------------------------------------
pdfinfo returned no UTF-8 data
Scan time: 0.64 secs
The scan failed to find an isbn in 0.64 secs
Failed to extract ISBN
===================================================
Scan complete, with 1 failures
The error message "pdfinfo returned no UTF-8 data" comes from get_page_count() in pdf.py. In this method, the line
    raw = raw.decode('utf-8')
should be made fault tolerant:
    raw = raw.decode('utf-8', errors='replace')
With these changes, ISBN extraction for PDF files is now running smoothly for me! |
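For anyone wanting to see the three problems in isolation, here is a minimal standalone sketch. It does not use the plugin's code: FakeScanner, scan_unsafe, scan_safe, percent_format, describe_exception and decode_tolerant are all hypothetical names made up for illustration, and whether the plugin's logger really formats with % is an assumption. Note that the keyword argument to bytes.decode() is errors (plural).

```python
class FakeScanner:
    """Hypothetical stand-in for the plugin's BookScanner."""
    def __init__(self):
        self.pages_seen = 0


def scan_unsafe(fail):
    # 'scanner' is bound only inside the try block, so a failure before the
    # assignment leaves the name unbound when it is used afterwards.
    try:
        if fail:
            raise OSError('simulated pdfinfo failure')
        scanner = FakeScanner()
    except OSError:
        pass
    return scanner.pages_seen


def scan_safe(fail):
    # Binding 'scanner' before the try block keeps the later reference valid.
    scanner = FakeScanner()
    try:
        if fail:
            raise OSError('simulated pdfinfo failure')
    except OSError:
        pass
    return scanner.pages_seen


def percent_format(msg, *args):
    # %-style formatting raises TypeError when an argument has no placeholder.
    return msg % args


def describe_exception(e):
    # Build the complete message first, so no extra arguments are left over.
    return 'Exception when scanning for ISBN: {}: {}'.format(type(e).__name__, e)


def decode_tolerant(raw):
    # Undecodable bytes become U+FFFD instead of raising UnicodeDecodeError.
    return raw.decode('utf-8', errors='replace')
```

Calling scan_unsafe(True) raises UnboundLocalError, while scan_safe(True) returns normally. percent_format('Exception when scanning for ISBN:', 'x') raises the "not all arguments converted during string formatting" TypeError.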
11-21-2020, 10:50 PM | #404 | |
Grand Sorcerer
Posts: 24,907
Karma: 47303748
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
|
Quote:
Of course, it could be done. But, the plugin just uses regex to find the possibilities and then validates what was found. Adding more checks would complicate that search. But, it isn't something I am interested in doing. I don't use the plugin, and only made changes to fix it for calibre 5. If anyone wants to add this, please do so. |
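To illustrate the "regex to find the possibilities, then validate" approach, here is a rough sketch. It is not the plugin's actual pattern or code: CANDIDATE and the function names are assumptions made up for this example. The check-digit rules themselves are the standard ones (weights 10 down to 1 modulo 11 for ISBN-10, alternating weights 1 and 3 modulo 10 for ISBN-13).

```python
import re

# Hypothetical candidate pattern: a digit, then digits, spaces or hyphens,
# ending in a digit or X. It deliberately over-matches; validation filters.
CANDIDATE = re.compile(r'[0-9][0-9\s-]{8,15}[0-9Xx]')


def is_valid_isbn10(s):
    if not re.fullmatch(r'[0-9]{9}[0-9X]', s):
        return False
    # Weights run 10, 9, ..., 1; a trailing X counts as 10.
    total = sum((10 - i) * (10 if ch == 'X' else int(ch))
                for i, ch in enumerate(s))
    return total % 11 == 0


def is_valid_isbn13(s):
    if not re.fullmatch(r'[0-9]{13}', s):
        return False
    # Weights alternate 1, 3, 1, 3, ...
    total = sum(int(ch) * (3 if i % 2 else 1) for i, ch in enumerate(s))
    return total % 10 == 0


def find_valid_isbns(text):
    """Return every valid ISBN in the text, in the order found."""
    found = []
    for match in CANDIDATE.finditer(text):
        candidate = re.sub(r'[\s-]', '', match.group()).upper()
        if is_valid_isbn13(candidate) or is_valid_isbn10(candidate):
            found.append(candidate)
    return found
```

With the numbers from the scan log earlier in the thread, 9781416059400 and 1865843830 pass these checks, while "1865 853333" is rejected.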
|
12-26-2020, 01:34 PM | #405 |
Connoisseur
Posts: 51
Karma: 10
Join Date: Nov 2012
Device: none
|
How about an option to use the last ISBN found instead of the first? ebook ISBNs seem to be listed last pretty frequently and those are the ones I prefer.
Thanks. |
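A possible shape for such an option, combined with the earlier request to prefer the ISBN marked as the ebook edition, is sketched below. Nothing here exists in the plugin: choose_isbn, EBOOK_HINT and the (isbn, context) pairs are hypothetical, and the label pattern is only a guess at how books mark their ebook ISBN.

```python
import re

# Hypothetical label pattern; real books vary in how they mark the ebook ISBN.
EBOOK_HINT = re.compile(r'\be-?isbn\b|\be-?book\b|\belectronic\b', re.IGNORECASE)


def choose_isbn(matches, prefer_last=False):
    """Pick one ISBN from (isbn, context) pairs given in document order.

    'context' is the text surrounding the match, for example the line it
    was found on. An ISBN whose context looks ebook-related wins; otherwise
    the first or last match is returned, depending on prefer_last.
    """
    if not matches:
        return None
    labelled = [isbn for isbn, context in matches if EBOOK_HINT.search(context)]
    if labelled:
        return labelled[0]
    return matches[-1][0] if prefer_last else matches[0][0]
```

For example, choose_isbn([('111', 'ISBN 111'), ('222', 'ISBN 222 (ebook)')]) picks '222', and with no labelled entry prefer_last=True picks the final match.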
|