07-26-2025, 07:12 PM   #6
martinmcu
Device: Kindle PW5SE
Solved!! Admittedly, not entirely within the framework of Calibre, but worth documenting for others. Here are the steps that I took:
  1. Edit the epub inside of Calibre. The CSS needed to be updated to define the page size properly. These were the lines that were added to force a precise page size with no margins:
    Code:
    @page {
      size: 838px 1188px;
      margin: 0;
    }
    After the CSS was updated, I made sure that "Check book" came back clean.
  2. Unzip the epub to a temporary working folder.
  3. (Optional) Optimize background images. While the CSS defined a page size of 838x1188 pixels, the actual images were much larger. Using ImageMagick on Linux made this trivial:
    Code:
    cd /tmp/book/OEBPS/images
    for i in page*jpg; do convert "$i" -resize 838x1188 "small_$i" && mv "small_$i" "$i"; done
  4. Remove links from table of contents page. Unfortunately, I could not figure out a way to keep the links valid. After converting, they were referring to the absolute paths to the expanded xhtml files on my computer. In this case, a simple sed script cleaned this out:
    Code:
    cd /tmp/book/OEBPS/xhtml
    sed -E -i 's:<a href="spread_[0-9]+\.xhtml">::g; s:</a>::g' spread_6.xhtml
  5. Convert each individual xhtml file to PDF using WeasyPrint:
    Code:
    cd /tmp/book/OEBPS/xhtml
    for htm in *xhtml; do weasyprint --optimize-images --full-fonts --hinting "$htm" "${htm}.pdf"; done
    rename 's/_(\d)\./_0$1./' *pdf
    The rename command was necessary to put the pages in the correct order to join them into one PDF. Otherwise, page 5 would have wound up between pages 49 and 50.
  6. Create a single PDF from the individual pages. The shell expands the glob in sorted order, which is why the zero-padding in the previous step matters:
    Code:
    pdfunite *pdf book.pdf
  7. Create a table of contents for the PDF. This was done using pdftk by dumping the PDF data, writing a list of bookmarks, and then merging it back in:
    Code:
    pdftk book.pdf dump_data > book.info
    Lines like the following were added to the book.info file:
    Code:
    BookmarkBegin
    BookmarkTitle: Contents
    BookmarkLevel: 1
    BookmarkPageNumber: 6
    BookmarkBegin
    BookmarkTitle: Before You Begin
    BookmarkLevel: 2
    BookmarkPageNumber: 8
    BookmarkBegin
    BookmarkTitle: Be Prepared
    BookmarkLevel: 1
    BookmarkPageNumber: 11
    BookmarkBegin
    BookmarkTitle: Making Sense of the Screen
    BookmarkLevel: 2
    BookmarkPageNumber: 12
    This was slightly painful, as I couldn't find a good way to automate the generation of these lines. Still, there weren't that many that were needed. Once the bookmarks were written, it was time to merge them back into the PDF:
    Code:
    pdftk book.pdf update_info book.info output book2.pdf
  8. Finally, this PDF could be brought back into Calibre by "Add books > Add files to selected book records".
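For anyone who wants to avoid typing the bookmark records in step 7 by hand, here is a minimal sketch of one way to generate them. It is not what I actually did. It assumes you first write the outline as a plain text file with one entry per line in the form level, TAB, page number, TAB, title. That input format and the function name make_bookmarks are made up for this example.

```shell
#!/bin/sh
# Sketch: generate pdftk bookmark records from a hand-written outline.
# Assumed input format (hypothetical, not from the steps above):
#   level<TAB>page<TAB>title      one entry per line, read from stdin
make_bookmarks() {
  awk -F '\t' '
    # Skip blank or malformed lines that lack all three fields.
    NF >= 3 {
      print "BookmarkBegin"
      print "BookmarkTitle: " $3
      print "BookmarkLevel: " $1
      print "BookmarkPageNumber: " $2
    }'
}

# Example run with the first two entries from step 7.
printf '1\t6\tContents\n2\t8\tBefore You Begin\n' | make_bookmarks
```

The output has the same four-line shape as the records shown in step 7, so it can be pasted into book.info at the same place before running update_info. Titles with non-ASCII characters may need extra care, which this sketch does not handle.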