Quote:
Originally Posted by scissors
Hi Starson.
Thanks for the reply. I thought that was the case - but as the attached image shows, it doesn't always work.
|
I was responding to your question about whether it should "totally remove a downloaded pages <head> section".
Quote:
Any idea why it ends up in the navbar and not in the article?
|
I agree with Serpentine - it's probably the formatting element sitting inside what's supposed to be quoted text in the meta tag. Something is getting confused about where the tags start and stop - probably BeautifulSoup. I would expect preprocess_regexps to be able to handle it, but I can't be sure.
I'd definitely print the soup before and after preprocess_regexps to see what's coming in and whether it's getting processed correctly.
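For what it's worth, here is a rough standalone sketch of that before/after check. It only assumes that preprocess_regexps is a list of (compiled regex, callable) pairs applied to the downloaded HTML. The sample page, the <head>-stripping pattern and the apply_regexps helper are made up for illustration and are not taken from your recipe or from calibre itself.

```python
import re

# Hypothetical sample page: a meta tag whose quoted content contains a
# formatting element, similar to the situation described above.
raw_html = (
    '<html><head><title>Example</title>'
    '<meta name="description" content="An <i>italic</i> word">'
    '</head><body><p>Article text</p></body></html>'
)

# Same shape as a recipe's preprocess_regexps: each entry is a compiled
# regex plus a callable that takes the match and returns the replacement.
# This particular rule (an assumption) drops the whole <head> section.
preprocess_regexps = [
    (re.compile(r'<head\b.*?</head>', re.DOTALL | re.IGNORECASE),
     lambda match: ''),
]


def apply_regexps(html, rules):
    """Apply each (pattern, callable) rule in order (illustrative helper)."""
    for pattern, replace in rules:
        html = pattern.sub(replace, html)
    return html


cleaned_html = apply_regexps(raw_html, preprocess_regexps)

# Print both versions so it is obvious what came in and what survived.
print('BEFORE:', raw_html)
print('AFTER: ', cleaned_html)
```

If the AFTER output still contains the meta tag, the pattern isn't matching and the problem is in the regexp rather than in BeautifulSoup. Inside the recipe itself, printing the soup from an overridden preprocess_html should show what the parser made of the result.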