MobileRead Forums - View Single Post

tonyx3 · 05-27-2010, 04:07 AM

I see. So calibre uses two different PDF-to-HTML engines?

The one used in the conversion pipeline is obviously returning different results from the one used in the regex wizard.

Quote:

I prefer to spend the time just improving the PDF engine so it removes headers and footers automatically.

That would be amazing.

Unfortunately, I've never once had the defaults work for removing headers or footers from PDFs. I've always had to write my own regex. And on multiple occasions I've had a regex match perfectly in the preview and then not get removed in the conversion (which is one reason I wish the preview HTML matched the conversion HTML).
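To illustrate the kind of mismatch described above, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the book title, the page number, the `calibre1` class name and the `&nbsp;` entity are invented, and neither HTML string is actual calibre output. It only shows how a header regex written against one rendering of a page can silently fail against a slightly different rendering of the same page.

```python
import re

# Hypothetical header: a running book title followed by a page number.
# Written against the markup seen in the (made-up) preview HTML below.
header_re = re.compile(r'<p>My Book Title\s+\d+</p>\s*')

# Made-up HTML as a preview might show it.
preview_html = '<p>My Book Title 12</p>\n<p>Chapter text continues here.</p>'

# Made-up HTML for the same page from a different engine:
# same visible text, but with a class attribute and a non-breaking space.
pipeline_html = (
    '<p class="calibre1">My Book Title&nbsp;12</p>\n'
    '<p class="calibre1">Chapter text continues here.</p>'
)

# The regex strips the header from the preview markup...
preview_clean = header_re.sub('', preview_html)

# ...but leaves the other markup untouched, because '<p>' and the
# plain space no longer match.
pipeline_clean = header_re.sub('', pipeline_html)

# A more tolerant pattern: allow attributes on <p> and either
# whitespace or &nbsp; between the title and the page number.
tolerant_re = re.compile(r'<p[^>]*>My Book Title(?:\s|&nbsp;)+\d+</p>\s*')
tolerant_clean = tolerant_re.sub('', pipeline_html)

print(preview_clean)
print(pipeline_clean == pipeline_html)
print(tolerant_clean)
```

Loosening the pattern around tags and whitespace is one way to make a hand-written regex less sensitive to differences between the two HTML outputs, though it cannot help if the header text itself is split or reordered.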

I'm sure PDF conversion, given the format's nature, must be one of the bigger headaches in developing the conversion system.