12-18-2015, 05:03 AM   #5
MiB
Junior Member
Quote:
Originally Posted by eschwartz
Normally the regex is applied to the raw contents of the input format, i.e. unzipped EPUB/AZW3 (X)HTML. But PDF is, ah, complicated, so it has to be turned into HTML before you can convert that HTML to something else.
Thanks for the clarifications. In the end I gave up and copied the HTML into Sublime Text and RegexRX, where I wrote a more permissive regex and managed to get rid of all the footers. Some of the issues stemmed from the fact that Calibre used different HTML classes for the same footer.

This was quite hard to discover in Calibre. Hopefully I learned something for my next title.
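
For anyone running into the same class mismatch, here is a rough sketch of what a "more open" regex might look like. It is not the pattern actually used in this post, which was built in Sublime Text and RegexRX; Python is used only for illustration. The class names, the footer text, and the `strip_footers` helper are all made up. The idea is to match on the footer's text and accept any class value, rather than pinning the pattern to one class.

```python
import re

# Hypothetical sample: the same footer emitted with two different classes.
# The class names and the footer text are invented for illustration.
html = (
    '<p class="calibre1">First paragraph.</p>\n'
    '<p class="calibre7">My Book Title | Page 12</p>\n'
    '<p class="calibre1">Second paragraph.</p>\n'
    '<p class="calibre9">My Book Title | Page 13</p>\n'
)

# Key on the footer text and allow any class value ([^"]*),
# instead of hard-coding a single class name.
FOOTER_RE = re.compile(
    r'<p\s+class="[^"]*">\s*My Book Title \| Page \d+\s*</p>\n?'
)

def strip_footers(text):
    """Remove every paragraph that matches the footer pattern."""
    return FOOTER_RE.sub("", text)

cleaned = strip_footers(html)
print(cleaned)
```

A pattern pinned to `class="calibre7"` would have caught only the first footer here. Loosening the class part catches both, at the cost of needing footer text distinctive enough not to match real content.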