I saw you caught the "supported-by" class that added some cruft to articles. Thanks! So far I've found just one more, "accessibility-ad-header visually-hidden", which also adds cruft (typically the word "Advertisement", around the fifth paragraph of each article). I used this line in my own recipe, which does the job:
Quote:
classes('accessibility-ad-header visually-hidden'),
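For anyone wondering why one line covers both class names: as far as I can tell, calibre's `classes` helper builds a matcher that fires when an element carries any one of the space-separated names, not only when it carries all of them. Below is a minimal standalone sketch of that behaviour. The `classes` function here is my own stand-in written for illustration (an assumption about how the helper behaves, not a copy of calibre's code), and `matcher` is just a name I made up.

```python
def classes(names):
    # Hypothetical stand-in for calibre's `classes` helper (assumption):
    # returns a dict usable as a tag filter, matching any listed class name.
    wanted = frozenset(names.split())
    return dict(attrs={
        'class': lambda value: bool(value) and bool(frozenset(value.split()) & wanted)
    })


# Pull out the predicate that would be applied to each element's class attribute.
matcher = classes('accessibility-ad-header visually-hidden')['attrs']['class']

print(matcher('accessibility-ad-header visually-hidden'))  # True
print(matcher('visually-hidden'))                          # True: one name is enough
print(matcher('story-body'))                               # False
print(matcher(None))                                       # False: no class attribute
```

If that reading is right, the line also strips any other element that happens to use "visually-hidden" on its own, which is worth keeping in mind if something else goes missing from the articles.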
A side question -- I'm still curious about replacing hyperlinks in the recipe with plain old text, a change which I gather is not desirable for the upstream recipe in Calibre. I've been trying to implement it in the version of the recipe that I run, and from what I've seen on this forum and in other recipes, this bit of code is instrumental in removing hyperlinks:
Quote:
def preprocess_html(self, soup):
    for alink in soup.findAll('a'):
        if alink.string is not None:
            tstr = alink.string
            alink.replaceWith(tstr)
    return soup
I've tried it everywhere in the NYT recipe, but wherever I put it, it seems to have no effect. Can anyone tell me if there's something I'm missing -- is there a place I'm supposed to put it, or an extra bit of code that it hooks into, or something special about the structure of this recipe that requires something different? Thanks.
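To make clear what result I'm after, here is a minimal sketch that runs outside calibre, using only Python's standard library instead of the recipe's soup object. It drops every `<a>` tag but keeps whatever was inside it. That includes links whose content mixes text with nested tags, which the `alink.string is not None` test above would skip. `LinkFlattener`, `flatten_links` and the sample HTML are names and data I invented for the example, and this is not meant as a drop-in replacement for `preprocess_html`.

```python
from html import escape
from html.parser import HTMLParser


class LinkFlattener(HTMLParser):
    """Rebuild HTML without <a> tags, keeping the text and tags inside them.

    Comments and doctypes are ignored in this sketch.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            # get_starttag_text() gives the tag as it appeared in the input.
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        if tag != 'a':
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag != 'a':
            self.parts.append('</%s>' % tag)

    def handle_data(self, data):
        # Re-escape text so characters like & and < stay valid HTML.
        self.parts.append(escape(data, quote=False))


def flatten_links(html):
    parser = LinkFlattener()
    parser.feed(html)
    parser.close()
    return ''.join(parser.parts)


sample = ('<p>See <a href="https://example.com/a">this story</a> and '
          '<a href="https://example.com/b">a <em>second</em> one</a>.</p>')
print(flatten_links(sample))
# <p>See this story and a <em>second</em> one.</p>
```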