Yesterday, 09:40 AM   #1
fhanzlik
Device: Calibre on Linux PC
Python exception during PDF to TXT conversion

When I tried to convert a PDF document to text, no output was created and I got this error:
Code:
$ ebook-convert V2510.pdf V2510_ebook.txt --enable-heuristics
Conversion options changed from defaults:
  enable_heuristics: True
1% Converting input to HTML...
InputFormatPlugin: PDF Input running
on /home/OTHER/data/dos/diskd/UctoFH_doklady/faDosle.fh/t-mobile/Vyuctovani_55052935_2510.pdf
pdftohtml log:
Page-1
Page-2
Page-3
Traceback (most recent call last):
  File "/usr/bin/ebook-convert", line 21, in <module>
    sys.exit(main())
             ~~~~^^
  File "/usr/lib64/calibre/calibre/ebooks/conversion/cli.py", line 429, in main
    plumber.run()
    ~~~~~~~~~~~^^
  File "/usr/lib64/calibre/calibre/ebooks/conversion/plumber.py", line 1089, in run
    self.oeb = self.input_plugin(stream, self.opts,
               ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^
                                self.input_fmt, self.log,
                                ^^^^^^^^^^^^^^^^^^^^^^^^^
                                accelerators, tdir)
                                ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/calibre/calibre/customize/conversion.py", line 242, in __call__
    ret = self.convert(stream, options, file_ext,
                       log, accelerators)
  File "/usr/lib64/calibre/calibre/ebooks/conversion/plugins/pdf_input.py", line 66, in convert
    PDFDocument(xml, self.opts, self.log)
    ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/calibre/calibre/ebooks/pdf/reflow.py", line 1476, in __init__
    self.find_header_footer()
    ~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/usr/lib64/calibre/calibre/ebooks/pdf/reflow.py", line 1877, in find_header_footer
    if self.pages[head_page].texts \
       ~~~~~~~~~~^^^^^^^^^^^
IndexError: list index out of range
This is with calibre-8.0.1-5.fc42.x86_64 on Fedora 42 (x86_64 Linux).
The result is the same without the --enable-heuristics option.
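
To show what the last frame of the traceback seems to say, here is a minimal sketch of the failure mode. It is not calibre's code: the `Page` class and both helper functions are hypothetical names made up for illustration, and the index value 3 is only an assumption based on the pdftohtml log listing three pages. The sketch shows how indexing a page list one position past its end raises the same IndexError, and how a bounds check would avoid it.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """Hypothetical stand-in for a parsed PDF page (not calibre's class)."""
    texts: list = field(default_factory=list)


def page_has_text_unsafe(pages, index):
    # Same shape as the failing expression: no bounds check on the index.
    return bool(pages[index].texts)


def page_has_text_safe(pages, index):
    # Guarded variant: an out-of-range index is treated as "no text".
    if not 0 <= index < len(pages):
        return False
    return bool(pages[index].texts)


# Three pages, matching the Page-1..Page-3 lines in the pdftohtml log.
pages = [Page(["Invoice"]), Page(["Items"]), Page(["Total"])]

try:
    page_has_text_unsafe(pages, 3)  # valid indexes are 0, 1 and 2
    outcome = "no error"
except IndexError as exc:
    outcome = f"IndexError: {exc}"

print(outcome)
print(page_has_text_safe(pages, 3))
```

Whether the real find_header_footer code reaches an index past the page count in this way is only a guess from the traceback.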

Calibre was installed with these dependencies:
calibre-8.0.1-5.fc42.x86_64
optipng-7.9.1-1.fc42.x86_64
podofo-0.10.5-1.fc42.x86_64
python3-lxml-html-clean-0.4.2-1.fc42.noarch
python3-pyqt6-webengine-6.9.0-0.1.fc42.x86_64
python3-xxhash-3.6.0-1.fc42.x86_64
qt6-qtimageformats-6.9.3-1.fc42.x86_64
qt6-qtwebview-6.9.3-1.fc42.x86_64
libwebp-tools-1.5.0-2.fc42.x86_64
mathjax3-3.2.2-7.fc42.noarch
python3-apsw-3.47.2.0-2.fc42.x86_64
python3-css-parser-1.0.10-3.fc42.noarch
python3-html2text-2024.2.26-5.fc42.noarch
python3-html5-parser-0.4.12-5.fc42.x86_64
python3-mechanize-0.4.10-4.fc42.noarch
python3-pychm-0.8.6-16.fc42.x86_64
python3-regex-2024.11.6-1.fc42.x86_64
chmlib-0.40-45.fc42.x86_64

Does anyone know where the problem could be?
(I'm quite new to Calibre/ebook-convert.)
Thanks, Franta Hanzlik