Old Yesterday, 09:40 AM   #1
fhanzlik
Junior Member
Posts: 3
Karma: 10
Join Date: Dec 2025
Device: Calibre on Linux PC
Python exception during PDF to TXT conversion

When I tried to convert a PDF document to text, no output was created and I got this error:
Code:
$ ebook-convert V2510.pdf V2510_ebook.txt --enable-heuristics
Conversion options changed from defaults:
  enable_heuristics: True
1% Converting input to HTML...
InputFormatPlugin: PDF Input running
on /home/OTHER/data/dos/diskd/UctoFH_doklady/faDosle.fh/t-mobile/Vyuctovani_55052935_2510.pdf
pdftohtml log:
Page-1
Page-2
Page-3
Traceback (most recent call last):
  File "/usr/bin/ebook-convert", line 21, in <module>
    sys.exit(main())
             ~~~~^^
  File "/usr/lib64/calibre/calibre/ebooks/conversion/cli.py", line 429, in main
    plumber.run()
    ~~~~~~~~~~~^^
  File "/usr/lib64/calibre/calibre/ebooks/conversion/plumber.py", line 1089, in run
    self.oeb = self.input_plugin(stream, self.opts,
               ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^
                                self.input_fmt, self.log,
                                ^^^^^^^^^^^^^^^^^^^^^^^^^
                                accelerators, tdir)
                                ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/calibre/calibre/customize/conversion.py", line 242, in __call__
    ret = self.convert(stream, options, file_ext,
                       log, accelerators)
  File "/usr/lib64/calibre/calibre/ebooks/conversion/plugins/pdf_input.py", line 66, in convert
    PDFDocument(xml, self.opts, self.log)
    ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/calibre/calibre/ebooks/pdf/reflow.py", line 1476, in __init__
    self.find_header_footer()
    ~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/usr/lib64/calibre/calibre/ebooks/pdf/reflow.py", line 1877, in find_header_footer
    if self.pages[head_page].texts \
       ~~~~~~~~~~^^^^^^^^^^^
IndexError: list index out of range
This is with calibre-8.0.1-5.fc42.x86_64 on Fedora 42 x86_64 Linux.
The result is the same without the --enable-heuristics option.

Calibre was installed with these dependencies:
calibre-8.0.1-5.fc42.x86_64
optipng-7.9.1-1.fc42.x86_64
podofo-0.10.5-1.fc42.x86_64
python3-lxml-html-clean-0.4.2-1.fc42.noarch
python3-pyqt6-webengine-6.9.0-0.1.fc42.x86_64
python3-xxhash-3.6.0-1.fc42.x86_64
qt6-qtimageformats-6.9.3-1.fc42.x86_64
qt6-qtwebview-6.9.3-1.fc42.x86_64
libwebp-tools-1.5.0-2.fc42.x86_64
mathjax3-3.2.2-7.fc42.noarch
python3-apsw-3.47.2.0-2.fc42.x86_64
python3-css-parser-1.0.10-3.fc42.noarch
python3-html2text-2024.2.26-5.fc42.noarch
python3-html5-parser-0.4.12-5.fc42.x86_64
python3-mechanize-0.4.10-4.fc42.noarch
python3-pychm-0.8.6-16.fc42.x86_64
python3-regex-2024.11.6-1.fc42.x86_64
chmlib-0.40-45.fc42.x86_64

Does anyone know where the problem could be?
(I'm quite new to Calibre/ebook-convert.)
Thanks, Franta Hanzlik
Old Yesterday, 10:29 AM   #2
kovidgoyal
creator of calibre
Posts: 45,690
Karma: 28549304
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
You are using a very old and unsupported version of calibre. Uninstall it and install the official binary from https://calibre-ebook.com/download_linux. If the error still occurs with that version, follow the instructions in: https://www.mobileread.com/forums/sh...d.php?t=186697
Old Yesterday, 04:47 PM   #3
fhanzlik
Hello Kovid, thanks for your help!
I think I have worked around my problem in the meantime by using pdftotext. I also installed the latest version of calibre as you recommended. Unfortunately, the error still seems to occur (and no output is produced):

Code:
$  ebook-convert 2041.pdf 2041.txt
1% Converting input to HTML...
InputFormatPlugin: PDF Input running
on /home/OTHER/tmp/2041.pdf
pdftohtml log:
Page-1
Page-2
Traceback (most recent call last):
  File "runpy.py", line 198, in _run_module_as_main
  File "runpy.py", line 88, in _run_code
  File "site.py", line 47, in <module>
  File "site.py", line 43, in main
  File "calibre/ebooks/conversion/cli.py", line 427, in main
  File "calibre/ebooks/conversion/plumber.py", line 1088, in run
  File "calibre/customize/conversion.py", line 242, in __call__
  File "calibre/ebooks/conversion/plugins/pdf_input.py", line 66, in convert
  File "calibre/ebooks/pdf/reflow.py", line 1474, in __init__
  File "calibre/ebooks/pdf/reflow.py", line 1885, in find_header_footer
IndexError: list index out of range
However, I also found a file for which the conversion went well (they are all accounting documents, invoices from one organization).
If it would help, I can e-mail you a PDF file that triggers the error (it is about 108 kB in size).
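The pdftotext workaround mentioned above can be scripted so that a batch of invoices still gets converted when ebook-convert fails on some of them. This is only a rough sketch: `convert_pdf_to_txt` is a hypothetical helper name, the `-layout` option of pdftotext is optional, and the `run` parameter exists only so the function can be exercised without either tool installed.

```python
import subprocess
from pathlib import Path


def convert_pdf_to_txt(pdf, txt, run=subprocess.run):
    """Try ebook-convert first; fall back to pdftotext if it fails.

    Returns the name of the tool that produced the output file.
    """
    pdf, txt = Path(pdf), Path(txt)
    # Remove stale output so a leftover file is not mistaken for success.
    txt.unlink(missing_ok=True)
    commands = [
        ("ebook-convert", ["ebook-convert", str(pdf), str(txt)]),
        ("pdftotext", ["pdftotext", "-layout", str(pdf), str(txt)]),
    ]
    errors = []
    for name, cmd in commands:
        try:
            result = run(cmd, capture_output=True, text=True)
        except FileNotFoundError:
            errors.append(f"{name}: not installed")
            continue
        # A Python traceback makes the tool exit non-zero and leaves no output.
        if result.returncode == 0 and txt.exists():
            return name
        errors.append(f"{name}: exit code {result.returncode}")
    raise RuntimeError("; ".join(errors))
```

Calling `convert_pdf_to_txt("2041.pdf", "2041.txt")` would then report which tool actually produced the text file.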

$ ebook-convert --version
ebook-convert (calibre 8.16.2)
Created by: Kovid Goyal <kovid@kovidgoyal.net>
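When a distribution package and the official binary may both be present, it can help to confirm which version the `ebook-convert` found on PATH really reports. A small sketch, assuming the `--version` output keeps the format shown above; `parse_calibre_version` and `installed_calibre_version` are hypothetical helper names.

```python
import re
import subprocess


def parse_calibre_version(output):
    """Extract the calibre version from `ebook-convert --version` output as a tuple of ints."""
    match = re.search(r"calibre (\d+(?:\.\d+)*)", output)
    if match is None:
        raise ValueError("no calibre version found in output")
    return tuple(int(part) for part in match.group(1).split("."))


def installed_calibre_version():
    """Run the ebook-convert found on PATH and return its reported version."""
    result = subprocess.run(
        ["ebook-convert", "--version"], capture_output=True, text=True, check=True
    )
    return parse_calibre_version(result.stdout)
```

Tuples compare element by element, so the parsed versions can be compared directly with `<` or `>=`.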
Old Yesterday, 10:08 PM   #4
kovidgoyal
Yes, I will need the PDF to be able to help you further.
Old Yesterday, 10:49 PM   #5
fhanzlik
The problematic file 2041.pdf should now be attached.
Again, thank you for your effort!
Fr. Hanzlik
Attached Files
File Type: pdf 2041.pdf (106.3 KB, 2 views)
Old Today, 12:45 AM   #6
kovidgoyal
https://github.com/kovidgoyal/calibr...769a37e1bea083
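The contents of the linked commit are not reproduced in this thread. Purely as an illustration of the kind of failure in the traceback, where `self.pages[head_page]` is indexed past the end of the page list, a defensive lookup might look like the sketch below. The function name and its fallback behaviour are hypothetical; this is not calibre's code, and the thread does not say why the index was out of range for this file.

```python
def pick_sample_page(pages, preferred_index):
    """Return pages[preferred_index] if it exists, else the last page, else None.

    Hypothetical helper: guards a list lookup that would otherwise raise
    IndexError when the document has fewer pages than expected.
    """
    if not pages:
        return None
    if 0 <= preferred_index < len(pages):
        return pages[preferred_index]
    # Index is out of range for this document; fall back to the last page.
    return pages[-1]
```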