Solved!! Admittedly, not entirely within the framework of Calibre, but worth documenting for others. Here are the steps that I took:
- Edit the EPUB inside Calibre. The CSS needed an @page rule to define the page size properly. These lines were added to force an exact page size with no margins:
Code:
@page {
size: 838px 1188px;
margin: 0;
}
After the CSS was updated, I made sure that "Check book" came back clean.
- Unzip the epub to a temporary working folder.
- (Optional) Optimize background images. While the CSS defined a page size of 838x1188 pixels, the actual images were much larger. Using ImageMagick on Linux made this trivial:
Code:
cd /tmp/book/OEBPS/images
for i in page*.jpg; do convert "$i" -resize 838x1188 "small_$i" && mv "small_$i" "$i"; done
- Remove links from table of contents page. Unfortunately, I could not figure out a way to keep the links valid. After converting, they were referring to the absolute paths to the expanded xhtml files on my computer. In this case, a simple sed script cleaned this out:
Code:
cd /tmp/book/OEBPS/xhtml
sed -re 's:<a href=\"spread_[0-9]+\.xhtml\">::g;s:</a>::g' -i spread_6.xhtml
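Since sed -i rewrites the file with no undo, it can help to preview the substitution first. This is only a minimal sketch, assuming a sed that supports -E (GNU and BSD both do); the function name strip_toc_links is made up for illustration. Like the command above, it drops every </a> in the input, not just the ones that belong to spread links.

```shell
#!/usr/bin/env bash
# Strip <a href="spread_N.xhtml"> ... </a> wrappers but keep the link text.
# Reads stdin and writes stdout, so no file is modified in place.
strip_toc_links() {
  sed -E -e 's:<a href="spread_[0-9]+\.xhtml">::g' -e 's:</a>::g'
}

# Example: show what would change in the contents page as a diff.
# strip_toc_links < spread_6.xhtml | diff spread_6.xhtml -
```

If the diff looks right, the in-place sed command can be run as before.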
- Convert each individual xhtml file to PDF using WeasyPrint:
Code:
cd /tmp/book/OEBPS/xhtml
for htm in *.xhtml; do weasyprint --optimize-images --full-fonts --hinting "$htm" "${htm}.pdf"; done
rename 's/_(\d)\./_0$1./' *.pdf
The rename command (the Perl version, which takes a substitution expression) zero-pads the single-digit page numbers so that the files sort in the correct order before being joined into one PDF. Otherwise, page 5 would have wound up between pages 49 and 50.
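The rename pattern above only pads single digits, so it stops helping once a book passes 99 pages. Here is a rough sketch of a more general zero-padding helper in plain bash, in case the Perl rename isn't installed. The name pad_name and the default width of 3 are my own choices for illustration, and it assumes file names shaped like spread_5.xhtml.pdf.

```shell
#!/usr/bin/env bash
# Zero-pad the number that sits between the last "_" before it and the
# following ".", e.g. spread_5.xhtml.pdf -> spread_005.xhtml.pdf.
# Names that do not match the pattern are printed unchanged.
pad_name() {
  local name=$1 width=${2:-3}
  if [[ $name =~ ^(.*_)([0-9]+)(\..*)$ ]]; then
    # 10# forces base 10 so that values like 08 are not read as octal.
    printf "%s%0${width}d%s\n" \
      "${BASH_REMATCH[1]}" "$((10#${BASH_REMATCH[2]}))" "${BASH_REMATCH[3]}"
  else
    printf '%s\n' "$name"
  fi
}

# Example: rename every generated PDF in the current directory.
# for f in *.pdf; do
#   new=$(pad_name "$f")
#   [ "$f" = "$new" ] || mv -- "$f" "$new"
# done
```

With every number padded to the same width, a plain shell glob lists the pages in reading order.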
- Create a single PDF from the individual pages:
Code:
pdfunite *.pdf book.pdf
- Create a table of contents for the PDF. This was done using pdftk by dumping the PDF data, writing a list of bookmarks, and then merging it back in:
Code:
pdftk book.pdf dump_data > book.info
Lines like the following were added to the book.info file:
Code:
BookmarkBegin
BookmarkTitle: Contents
BookmarkLevel: 1
BookmarkPageNumber: 6
BookmarkBegin
BookmarkTitle: Before You Begin
BookmarkLevel: 2
BookmarkPageNumber: 8
BookmarkBegin
BookmarkTitle: Be Prepared
BookmarkLevel: 1
BookmarkPageNumber: 11
BookmarkBegin
BookmarkTitle: Making Sense of the Screen
BookmarkLevel: 2
BookmarkPageNumber: 12
This was slightly painful, as I couldn't find a good way to automate the generation of these lines. Still, there weren't that many that were needed. Once the bookmarks were written, it was time to merge them back into the PDF:
Code:
pdftk book.pdf update_info book.info output book2.pdf
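For the bookmark-writing step, one way to cut down on typing might be to keep the outline in a short text file and expand it into bookmark records with awk. This is only a sketch: the "level|page|title" input format and the names make_bookmarks and outline.txt are invented here and are not something pdftk defines.

```shell
#!/usr/bin/env bash
# Turn "level|page|title" lines on stdin into pdftk bookmark records.
# Lines with fewer than three fields are skipped.
make_bookmarks() {
  awk -F'|' 'NF >= 3 {
    # Rejoin the title in case it contains a "|" itself.
    title = $3
    for (i = 4; i <= NF; i++) title = title "|" $i
    print "BookmarkBegin"
    print "BookmarkTitle: " title
    print "BookmarkLevel: " $1
    print "BookmarkPageNumber: " $2
  }'
}

# Example: outline.txt (hypothetical) holds lines such as
#   1|6|Contents
#   2|8|Before You Begin
# make_bookmarks < outline.txt > bookmarks.txt
```

The generated records can then be pasted into book.info in the same place as the hand-written ones before running update_info.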
- Finally, the finished PDF (book2.pdf) could be brought back into Calibre via "Add books > Add files to selected book records".