MobileRead Forums - View Single Post - Custom recipes (archive, read-only)

Starson17 · 07-02-2010, 06:06 PM

Quote:

Originally Posted by schnortz

I am not having luck with the following...

I enjoy answering these little puzzles, but it's a lot easier if you provide a link to the page that you are having trouble with, and a copy of the recipe you're using.

Here you are asking why this doesn't match something. Usually, that would be impossible without a link to the "something," but I do see an error in this.

Quote:

1. Remove the Additional Information box that comes up after a couple of paragraphs of each article. I have tried

Code:

preprocess_regexps = [
(re.compile(r'<p></p><div*.</div>', re.IGNORECASE | re.DOTALL), lambda match : r''),
]

without success.

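I assume you wanted to delete everything in the <div> tag, but you reversed the "everything": it should be ".*", not "*.". As written, "<div*." means "<di", then zero or more "v" characters, then exactly one arbitrary character, so it will not match a <div> that has any attributes or content.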
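Here is a quick way to check the difference outside of calibre. This is only a sketch: the sample HTML is made up (I don't have a link to the real page), and the non-greedy ".*?" variant is my own suggestion, added because a plain ".*" with re.DOTALL will run on to the last </div> in the page rather than stopping at the first one.

```python
import re

FLAGS = re.IGNORECASE | re.DOTALL

# Pattern as posted: "v*" then a single "." (the "everything" is reversed).
broken = re.compile(r'<p></p><div*.</div>', FLAGS)
# Corrected order, greedy: runs to the LAST </div> in the text.
greedy = re.compile(r'<p></p><div.*</div>', FLAGS)
# Corrected order, non-greedy: stops at the FIRST </div>.
lazy = re.compile(r'<p></p><div.*?</div>', FLAGS)

# Hypothetical page fragment, invented for illustration only.
sample = (
    '<p>Intro.</p>'
    '<p></p><div class="info">Additional\nInformation</div>'
    '<p>Body.</p><div>Keep me</div>'
)

broken_result = broken.sub('', sample)  # nothing is removed
greedy_result = greedy.sub('', sample)  # removes too much
lazy_result = lazy.sub('', sample)      # removes only the first box

print(repr(broken_result))
print(repr(greedy_result))
print(repr(lazy_result))
```

If the box you want to remove contains nested <div> tags, even the non-greedy pattern will stop too early, at the first inner </div>, so check the result against the real page source.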

Quote:

2. Remove any RSS feeds that start with the word "Photo" or "Photos:"

Any guidance that you can give would be very helpful.

I suspect you want to remove any articles whose titles start with those words, not "feeds" - correct? You control the list of feeds.
For articles, I used to think that filter_regexps would do that job, but I never got it to work. Maybe it only applies to links followed recursively, not to the main article link.
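One way to test the title matching itself, independent of calibre, is a small filter like the one below. It is a minimal sketch under assumptions: the function name drop_photo_articles and the sample titles are hypothetical, and I am assuming that "start with" refers to the article title. Inside a recipe, a filter like this would have to be applied to each feed's article list after the feeds are parsed (for example in an overridden parse_feeds), which is an assumption on my part and not something I have confirmed for this site.

```python
import re

# Matches titles that begin with the whole word "Photo" or "Photos"
# (optionally followed by ":"), but not words like "Photographer".
PHOTO_TITLE = re.compile(r'^\s*Photos?\b', re.IGNORECASE)


def drop_photo_articles(titles):
    """Return only the titles that do not start with Photo/Photos."""
    return [t for t in titles if not PHOTO_TITLE.match(t)]


# Hypothetical titles, invented for illustration only.
titles = [
    'Photos: Canada Day fireworks',
    'Photo gallery of the week',
    'Council approves budget',
    'Photographer wins award',
]

kept = drop_photo_articles(titles)
print(kept)
```

The \b word boundary is what keeps "Photographer wins award" in the list; drop it if you also want to filter titles like that.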