Old 11-02-2011, 10:53 AM   #7
Starson17
Wizard
Quote:
Originally Posted by scissors
Hi Starson.

Thanks for the reply. I thought that was the case - but as the attached image shows, it doesn't always work.
I was responding to your question about whether it should "totally remove a downloaded page's <head> section".
Quote:
Any idea why it ends up in the navbar and not in the article?
I agree with Serpentine - it's probably the formatting element inside what's supposed to be quoted text in the meta tag. Something, probably BeautifulSoup, is getting confused about where the tags start and stop. I would expect preprocess_regexps to be able to handle it, but I can't be sure.

I'd definitely print the soup before and after preprocess_regexps to see what's coming in and whether it's getting processed correctly.
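
Here is a minimal sketch of that before/after check, runnable outside calibre. It assumes only that preprocess_regexps is a list of (compiled pattern, replacement callable) pairs; the sample HTML, the pattern, and the apply_regexps helper are made up for illustration and are not taken from the recipe in question or from calibre's own code.

```python
import re

# Same shape as a recipe's preprocess_regexps: (compiled pattern, callable) pairs.
# This pattern is an assumption for illustration: it empties the whole <head>
# section, so a meta tag with markup inside its quoted text never gets parsed.
preprocess_regexps = [
    (re.compile(r'<head\b.*?</head>', re.DOTALL | re.IGNORECASE),
     lambda match: '<head></head>'),
]


def apply_regexps(raw_html, regexps):
    """Hypothetical helper: apply each (pattern, callable) pair in order."""
    for pattern, replacement in regexps:
        raw_html = pattern.sub(replacement, raw_html)
    return raw_html


# Hypothetical page with a formatting element inside the meta tag's quoted text.
SAMPLE = (
    '<html><head><title>Example</title>'
    '<meta name="description" content="A <b>bold</b> summary">'
    '</head><body><p>Article text</p></body></html>'
)

if __name__ == "__main__":
    print("before:", SAMPLE)
    print("after: ", apply_regexps(SAMPLE, preprocess_regexps))
```

If the "after" output still contains the meta tag, the regexp isn't matching and the problem is in the pattern; if it's gone here but still turns up in the article, something later in the processing is the place to look.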