Old 07-02-2010, 05:06 PM   #2231
Starson17
Quote:
Originally Posted by schnortz View Post
I am not having luck with the following...
I enjoy answering these little puzzles, but it's a lot easier if you provide a link to the page that you are having trouble with, and a copy of the recipe you're using.

Here you are asking why this doesn't match something. Usually, that would be impossible without a link to the "something," but I do see an error in this.
Quote:
1. Remove the Additional Information box that comes up after a couple of paragraphs of each article. I have tried
Code:
preprocess_regexps = [
(re.compile(r'<p></p><div*.</div>', re.IGNORECASE | re.DOTALL), lambda match : r''),
]
without success.

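I assume you wanted to delete everything in the <div> tag, but you reversed the "everything" token: it should be ".*", not "*.". As written, "<div*." matches "<di", then any number of "v" characters, then exactly one more character, so it will never reach the closing </div> of a real tag.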
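
Here is a minimal sketch of the difference, runnable outside calibre with only the re module. The sample HTML is made up, since the real page wasn't posted, and the non-greedy ".*?" is my own choice rather than something from the original recipe: with re.DOTALL a plain ".*" would run on to the last </div> on the page.

```python
import re

# Hypothetical markup standing in for the real article page (not posted).
sample = ('<p>Intro</p><p></p><div class="info">Additional Information</div>'
          '<p>More text</p>')

# The pattern as posted: "<div*." is "<di", zero or more "v", one character.
broken = re.compile(r'<p></p><div*.</div>', re.IGNORECASE | re.DOTALL)

# Corrected: ".*?" matches any run of characters, stopping at the first </div>.
fixed = re.compile(r'<p></p><div.*?</div>', re.IGNORECASE | re.DOTALL)

broken_result = broken.sub('', sample)  # nothing matches, text is unchanged
fixed_result = fixed.sub('', sample)    # the empty <p></p> and the <div> are removed

# The same compiled pattern dropped into the recipe attribute from the question.
preprocess_regexps = [
    (fixed, lambda match: ''),
]
```

If the box contains nested <div> tags, stopping at the first </div> will leave part of it behind, so the pattern would need to be adjusted once the actual markup is known.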

Quote:
2. Remove any RSS feeds that start with the word "Photo" or "Photos:"

Any guidance that you can give would be very helpful.
I suspect you want to remove any articles that start with those words, not "feeds" - correct? You control the list of feeds.
For articles, I used to think that filter_regexps would do that job, but I never got it to work. Maybe it only works on recursed links, not the main article link.
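
Whatever hook ends up doing the filtering, the title test itself is easy to check on its own. This is only a plain-Python sketch of that test, not a working recipe: the titles and the helper name is_photo_title are invented for illustration, and it does not show how to wire the check into calibre.

```python
import re

# "Photo" or "Photos" at the start of the title, followed by a word boundary,
# so "Photos: ..." and "Photo ..." match but "Photography ..." does not.
PHOTO_TITLE = re.compile(r'Photos?\b', re.IGNORECASE)

def is_photo_title(title):
    """Return True if the title starts with the word Photo or Photos."""
    return PHOTO_TITLE.match(title.strip()) is not None

# Hypothetical article titles.
titles = [
    'Photos: Fireworks downtown',
    'Photo of the day',
    'Photography tips for beginners',
    'City council votes on budget',
]

kept = [t for t in titles if not is_photo_title(t)]
```

Using match rather than search anchors the test to the start of the title, which is what "start with the word" asks for.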