PDF and "Exception when scanning for ISBN: not all arguments converted..."
Same error message here, so I played around with the plugin code.
As I found out, this message is just the tip of the iceberg.
To avoid this error message, the line
log.error('Exception when scanning for ISBN:', e)
in extract_threaded() in jobs.py should be changed to
log.error('Exception when scanning for ISBN: {}: {}'.format(type(e).__name__, e))
or
log.error(traceback.format_exc())
(which needs import traceback at the top of jobs.py) or similar, so that the whole message is passed as a single string.
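For anyone wondering where the "not all arguments converted" text comes from, here is a minimal sketch. It assumes the plugin's log object applies %-style formatting to any extra arguments, the way the standard library logging module does; I have not checked how calibre's logger does it internally. The helper format_like_percent_logger is a made-up stand-in, not plugin code.

```python
def format_like_percent_logger(msg, *args):
    """Hypothetical stand-in for a logger that %-formats extra arguments."""
    return msg % args if args else msg


err = ValueError('bad pdf')

# Original call shape: one extra argument, but no %s placeholder in the message.
try:
    format_like_percent_logger('Exception when scanning for ISBN:', err)
    failure = None
except TypeError as exc:
    failure = str(exc)

# Changed call shape: the message is fully built before it reaches the logger.
fixed = format_like_percent_logger(
    'Exception when scanning for ISBN: {}: {}'.format(type(err).__name__, err)
)

print(failure)
print(fixed)
```

The point is only that the exception text has to be part of the message string itself, or the message needs a matching placeholder.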
Next I came across:
Starting job: Extract ISBN for 1 books
===================================================
Path: E:\Libraries\Literature\Theodor Fontane\Unterm Birnbaum (264)\ Unterm Birnbaum - Theodor Fontane.pdf
---------------------------------------------------
WorkerError: Traceback (most recent call last):
File "calibre\utils\ipc\simple_worker.py", line 300, in main
File "calibre_plugins.extract_isbn.pdf", line 92, in get_isbn
UnboundLocalError: local variable 'scanner' referenced before assignment
Failed to run pdfinfo/pdftohtml
Error in jobs.py:
Exception when scanning for ISBN:
Failed to extract ISBN
===================================================
Scan complete, with 1 failures
So in pdf.py, immediately after def get_isbn(output_dir, pdf_name, log=None):, I added the following line so that scanner is always defined, even when pdfinfo/pdftohtml fails:
scanner = BookScanner(log)
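The pattern behind this kind of UnboundLocalError can be sketched as follows. This is not the real pdf.py: the BookScanner class here is a hypothetical stand-in with a made-up isbn attribute, and the tool_fails flag just simulates the pdfinfo/pdftohtml failure.

```python
class BookScanner:
    """Hypothetical stand-in for the plugin's scanner class."""

    def __init__(self, log=None):
        self.log = log
        self.isbn = None  # nothing found yet


def get_isbn_broken(tool_fails):
    try:
        if tool_fails:
            raise RuntimeError('Failed to run pdfinfo/pdftohtml')
        scanner = BookScanner()  # only reached when the tool succeeds
    except RuntimeError:
        pass  # the error is swallowed, but scanner was never bound
    return scanner.isbn


def get_isbn_fixed(tool_fails):
    scanner = BookScanner()  # bound before anything can fail
    try:
        if tool_fails:
            raise RuntimeError('Failed to run pdfinfo/pdftohtml')
    except RuntimeError:
        pass
    return scanner.isbn


try:
    get_isbn_broken(True)
    broken_error = None
except UnboundLocalError as exc:
    broken_error = type(exc).__name__

print(broken_error)
print(get_isbn_fixed(True))
```

With the assignment moved to the top, a failing external tool leads to a clean "no ISBN found" result instead of a crash in the worker.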
The next step was:
Starting job: Extract ISBN for 1 books
===================================================
Title: Unterm Birnbaum.
Format: PDF
Path: E:\Bibliotheken\Literatur\Theodor Fontane\Unterm Birnbaum (264)\Unterm Birnbaum - Theodor Fontane.pdf
---------------------------------------------------
pdfinfo returned no UTF-8 data
Scan time: 0.64 secs
The scan failed to find an isbn in 0.64 secs
Failed to extract ISBN
===================================================
Scan complete, with 1 failures
The error message "pdfinfo returned no UTF-8 data" comes from get_page_count() in pdf.py.
In this method, the line
raw = raw.decode('utf-8')
should be made fault tolerant:
raw = raw.decode('utf-8', errors='replace')
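A small sketch of what the tolerant decode does. The byte string below is made-up pdfinfo-like output containing one Latin-1 byte, and the regular expression is just my assumption of how a page count could be read from it, not the plugin's actual parsing. Note that the keyword argument of bytes.decode is spelled errors (plural).

```python
import re

# Made-up sample: 0xDF is a Latin-1 sharp s and is not valid UTF-8 here.
raw = b'Title: Stra\xdfe\nPages: 264\n'

# Strict decoding (the default) fails on the first invalid byte.
try:
    raw.decode('utf-8')
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# Tolerant decoding swaps each invalid byte for U+FFFD and carries on.
text = raw.decode('utf-8', errors='replace')

# The numeric fields are plain ASCII, so they survive the replacement.
match = re.search(r'^Pages:\s*(\d+)', text, re.MULTILINE)
page_count = int(match.group(1)) if match else 0

print(strict_ok, page_count)
```

Only the non-UTF-8 characters in fields like the title get mangled; the page count is still readable.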
With these changes, ISBN extraction for PDF files is now running smoothly for me!