I've reworked this one and it works fine, except for multipage articles. The problem is that calibre (or soup, or whatever does the parsing) doesn't recognize '<a href="?page=2">2</a>' as a valid link, even though it is one (a relative link). I overrode is_link_wanted to log every link it analyzes, and these links never reach the function.
So I guess the solution is to prepend the current URL in preprocess_html or preprocess_regexps... but I don't know where the current page URL is stored, or whether it's accessible at all.
In other words, I want to replace "?page=2" with "(current URL)?page=2", but I don't know how to get at "current URL".
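To make the goal concrete, here is a rough sketch of the rewrite I have in mind, using only the Python standard library. The helper name absolutize_query_links and the base_url argument are made up for illustration; base_url stands in for the "current URL" that I don't know how to obtain from calibre, so this shows only the string rewrite, not how to hook it into the recipe.

```python
import re
from urllib.parse import urljoin

# Matches href attributes whose value is a bare query string, e.g. href="?page=2".
QUERY_HREF = re.compile(r'href="(\?[^"]*)"')


def absolutize_query_links(html, base_url):
    """Rewrite href="?..." links so they are absolute against base_url.

    base_url is a placeholder for the URL of the page being processed
    (hypothetical: where calibre exposes it is the open question).
    """
    def fix(match):
        # urljoin keeps the base path and swaps in the new query string.
        return 'href="%s"' % urljoin(base_url, match.group(1))

    return QUERY_HREF.sub(fix, html)


sample = '<a href="?page=2">2</a>'
result = absolutize_query_links(sample, 'http://example.com/article/123')
print(result)
```

If the base URL already carries a query string (say it ends in ?page=1), urljoin replaces it rather than tacking on a second one, which is why I'd rather use it than plain string concatenation.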
Any hints?