MobileRead Forums - View Single Post - Converting between fixed-layout formats (ePub, AZW3, PDF)

tomsem · 10-29-2025, 07:55 PM

As far as I can tell, this is not a 'solved problem' within calibre or with its plugins, or even with public domain tools generally.

KindleUnpack does 'pretty well' with AZW3 to ePub. At least Thorium reader likes the result pretty much all of the time.

Image-only fixed layout (usually comics) is the low-hanging fruit, but any deviation in page image size can throw a wrench in things.
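To spot that kind of size deviation before converting, a minimal sketch like the one below could flag pages whose dimensions differ from the most common page size. It assumes PyMuPDF for reading page rectangles (the `page_sizes` helper); the names `page_sizes` and `odd_sized_pages` and the default tolerance of 1.0 are made up for illustration and are not part of any tool mentioned here.

```python
from collections import Counter


def page_sizes(path):
    """Return (width, height) for each page of a document PyMuPDF can open."""
    import pymupdf  # imported lazily so odd_sized_pages() works without it

    with pymupdf.open(path) as doc:
        return [(page.rect.width, page.rect.height) for page in doc]


def odd_sized_pages(sizes, tolerance=1.0):
    """Return indexes of pages whose size differs from the most common size.

    `tolerance` is an arbitrary threshold in the same units as `sizes`.
    """
    if not sizes:
        return []
    # Round first so tiny floating-point differences count as the same size.
    rounded = [(round(w), round(h)) for w, h in sizes]
    common_w, common_h = Counter(rounded).most_common(1)[0][0]
    return [
        i
        for i, (w, h) in enumerate(sizes)
        if abs(w - common_w) > tolerance or abs(h - common_h) > tolerance
    ]
```

Something like `odd_sized_pages(page_sizes("comic.cbz"))` would then list the pages that need cropping or padding, where `comic.cbz` is a placeholder file name.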

The hard case is converting PDF (with text objects) to fixed-layout ePub or AZW3 (with positioned text), which nothing seems to do a good job with ('good job' meaning 'preserves text and positioning in some way').

So what is missing is:

PDF to FL ePub
PDF to FL AZW3
FL ePub to PDF
FL ePub to AZW3

PyMuPDF claims to support conversions between any of its supported formats:

Document formats (input or output): PDF, XPS, ePub, Mobi, FB2, CBZ, SVG, TXT

Image formats:

Input formats: JPG/JPEG, PNG, BMP, GIF, TIFF, PNM, PGM, PBM, PPM, PAM, JXR, JPX/JP2, PSD

Output formats: JPG/JPEG, PNG, PNM, PGM, PBM, PPM, PAM, PSD, PS

It also has OCR support if it finds Tesseract's language support data.

This is example code to convert XPS to PDF:

Code:

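import pymupdf

# Open the XPS source document.
xps = pymupdf.open("input.xps")
# Convert the whole document to PDF, returned as bytes in memory.
pdfbytes = xps.convert_to_pdf()
# Reopen those bytes as a PDF document, then write it to disk.
pdf = pymupdf.open("pdf", pdfbytes)
pdf.save("output.pdf")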

(I assume 'mobi' is not the same as 'azw3', so even if everything else worked, one would still need to add conversion to AZW3 somehow, maybe by taking the KindleUnpack code and reversing its workflow.)
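One way to check what a given Kindle file actually is: a rough sketch that reads the 'file version' field from the MOBI header. It relies on the community-documented PalmDB/MOBI layout as I understand it ('BOOKMOBI' at offset 60, record 0 offset at 78, 'MOBI' marker 16 bytes into record 0, version 36 bytes in), so treat those offsets as assumptions to verify. The function name `mobi_file_version` is hypothetical.

```python
import struct


def mobi_file_version(data):
    """Return the MOBI header 'file version' from raw file bytes, or None.

    Offsets follow the community-documented PalmDB/MOBI layout and are
    assumptions here, not taken from an official specification.
    """
    # PalmDB type/creator fields should read "BOOKMOBI".
    if len(data) < 82 or data[60:68] != b"BOOKMOBI":
        return None
    # The first record-info entry starts at offset 78; its first four
    # bytes hold the file offset of record 0 (big-endian).
    (rec0,) = struct.unpack_from(">I", data, 78)
    # Record 0 starts with a 16-byte PalmDOC header, then the MOBI header.
    if len(data) < rec0 + 40 or data[rec0 + 16:rec0 + 20] != b"MOBI":
        return None
    (version,) = struct.unpack_from(">I", data, rec0 + 36)
    return version
```

As far as I know, a version of 8 indicates a KF8-only (AZW3) file, while 6 can be either old-style Mobi or a combined Mobi/KF8 file, so this is only a first hint and not a full classification.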

I am wondering if anyone has tried PyMuPDF out for converting between fixed layout formats. I am not holding high expectations for the resulting conversions, but maybe it is in the 'not too bad' category.
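As a starting point for such an experiment, here is a minimal sketch of the PDF-to-fixed-layout-page idea: pull text spans with their bounding boxes out of one PDF page using PyMuPDF's `page.get_text("dict")`, then emit an XHTML page with absolutely positioned spans. `convert_to_pdf()` is the only whole-document conversion call I am sure of, so the XHTML side is hand-rolled. The helper names `pdf_page_spans` and `spans_to_xhtml` are hypothetical, PDF points are naively treated as CSS pixels, and fonts, images, and the ePub packaging (OPF, `rendition:layout`) are all ignored.

```python
from html import escape


def pdf_page_spans(path, page_number=0):
    """Collect page size and text spans (text, bbox, font size) from a PDF page."""
    import pymupdf  # imported lazily so spans_to_xhtml() works without it

    with pymupdf.open(path) as doc:
        page = doc[page_number]
        width, height = page.rect.width, page.rect.height
        spans = []
        for block in page.get_text("dict")["blocks"]:
            for line in block.get("lines", []):  # image blocks have no "lines"
                for span in line["spans"]:
                    spans.append({
                        "text": span["text"],
                        "bbox": tuple(span["bbox"]),
                        "size": span["size"],
                    })
    return width, height, spans


def spans_to_xhtml(width, height, spans):
    """Render spans as absolutely positioned <span> elements on one page.

    Assumption: one PDF point maps to one CSS pixel.
    """
    body = []
    for s in spans:
        x0, y0, _x1, _y1 = s["bbox"]
        style = (
            f"position:absolute;left:{x0:.1f}px;top:{y0:.1f}px;"
            f"font-size:{s['size']:.1f}px;white-space:pre"
        )
        body.append(f'<span style="{style}">{escape(s["text"])}</span>')
    return (
        '<?xml version="1.0" encoding="utf-8"?>\n'
        '<html xmlns="http://www.w3.org/1999/xhtml">\n'
        "<head>\n"
        "<title>page</title>\n"
        f'<meta name="viewport" content="width={width:.0f}, height={height:.0f}"/>\n'
        "</head>\n"
        f'<body style="position:relative;width:{width:.0f}px;height:{height:.0f}px">\n'
        + "\n".join(body)
        + "\n</body>\n</html>\n"
    )
```

Calling `spans_to_xhtml(*pdf_page_spans("input.pdf"))` would give one page of markup to inspect in a reader such as Thorium, with `input.pdf` as a placeholder name. Keeping the extraction and rendering steps separate makes it easy to test the markup side without a PDF at hand.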

Is anyone else interested in this problem?

For ebooks (ePub or Kindle formats), reader support for fixed-layout content is not very good. There is rarely any annotation capability or even text search. So even with the best conversion, having it available might not serve any great purpose.

At any rate I hope to find a little time here and there to try some experiments.

Last edited by tomsem; 10-29-2025 at 09:09 PM.