06-22-2025, 06:34 PM   #1
JollyRachele
Help with converting EPUB with Greek characters to PDF

I have been using calibre's awesome CLI in the script below to generate PDF files from a few dictionaries I need (unfortunately I have to use the PDF format in order to import them into Zotero, a popular cross-platform annotation tool).

Code:
p="$HOME/Downloads/tmp/"
# Create the working directory if needed and stop if we cannot enter it.
mkdir -p "$p" && cd "$p" || exit 1
#rm -rf *
list="
https://archive.org/details/anelementarylat01lewigoog/page/58/mode/2up?ref=ol&view=theater
https://archive.org/details/latindictionaryf00andr?ref=ol&view=theater
https://archive.org/details/intermediategree00lidd?ref=ol&view=theater
https://archive.org/details/homericdictionar00auteiala?ref=ol&view=theater
"


# $list is left unquoted on purpose so that it splits into one URL per iteration.
for i in $list; do
  # Extract the archive.org item identifier (the path segment after /details/).
  id=$(printf '%s\n' "$i" | grep -o -P "/details/[^?/]*" | sed -e 's|^/details/||')
  wget "https://archive.org/download/$id/$id.epub"
done

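for i in ./*.epub; do # note: for i in "*.epub" (quoted) would not expand the glob
  # Options I have already tried without success are left commented out.
  ebook-convert "$i" "$i.pdf" #--txt-output-formatting plain #--txt-output-encoding utf-32 # --embed-all-fonts
  #pandoc "$i" -f epub -t pdf -s -o "$i.pdf"
done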
You may try this out yourself if you wish. I think it only downloads 3 of the 4 EPUB files, but that's just because one of them is 'temporarily unavailable' on archive.org.
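For what it's worth, the identifier extraction in the download loop could also be done with shell parameter expansion alone. This is only a minimal sketch: the function name `ia_id` is made up for illustration, and it assumes every URL contains a `/details/` segment, as the ones in my list do.

```shell
# Hypothetical helper: print the archive.org item identifier from a details URL.
# Assumes the URL contains "/details/"; other URLs are not handled.
ia_id() {
  _rest=${1#*/details/}    # drop everything up to and including "/details/"
  _rest=${_rest%%[/?]*}    # cut at the first "/" or "?" that follows
  printf '%s\n' "$_rest"
}

ia_id "https://archive.org/details/intermediategree00lidd?ref=ol&view=theater"
# prints: intermediategree00lidd
```

The printed identifier is what gets substituted twice into the `https://archive.org/download/.../....epub` address.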

Unfortunately, the resulting PDF doesn't contain proper Greek characters. Instead, it seems to reproduce the raw UTF-8 encoding that is part of the XHTML of the source EPUB!
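One check that might help narrow this down is whether the text going into the conversion really contains UTF-8 encoded Greek. Below is a minimal sketch, assuming GNU grep with `-P` (which the script already relies on); `has_greek_bytes` is a made-up helper name, and it only looks for the byte patterns of the Greek and Greek Extended blocks, not for valid text in general.

```shell
# Hypothetical helper: succeed if stdin contains UTF-8 bytes from the Greek blocks.
# Lead bytes 0xCE/0xCF cover U+0380-U+03FF (most of "Greek and Coptic");
# 0xE1 followed by 0xBC-0xBF covers U+1F00-U+1FFF ("Greek Extended", polytonic).
has_greek_bytes() {
  LC_ALL=C grep -q -P '[\xCE\xCF][\x80-\xBF]|\xE1[\xBC-\xBF][\x80-\xBF]'
}

# The octal escapes below are the UTF-8 bytes of the Greek word "logos".
printf '\316\273\317\214\316\263\316\277\317\202\n' | has_greek_bytes && echo "Greek bytes found"
```

It could be fed with something like `unzip -p book.epub '*.html' | has_greek_bytes`, where `book.epub` is a placeholder and the `.html` extension inside the archive is an assumption on my part.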

It's strange, as calibre's GUI previews everything fine...
Any ideas?

Thank you so much in advance!