09-26-2009, 10:03 AM   #776
olaf
Device: Kindle3
I figured out the smart-quotes thing; it was a matter of encoding. Now I am trying to work out how to replace text that is actually wrong in the source. In several places the RSS feed itself contains the literal string 'and #8216;' where a single quote should be. As far as I can tell, preprocess_regexps replaces everything between x and y with z, and it is the only mechanism I know of for making text replacements. I tried the following command, to no avail. Is this the right command? Do I have the syntax wrong? I just want to replace the entire string, so do I say: replace everything from 'and #8217' through the semicolon with "'" (the latter being a single quote embedded in double quotes)?

preprocess_regexps = [(re.compile(r'and #8216.?;', re.DOTALL|re.IGNORECASE), lambda match: '"')]
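To check the pattern outside of a recipe, here is a minimal standalone sketch using only Python's `re` module. It assumes the feed text literally contains 'and #8216;' and 'and #8217;', and it assumes each (pattern, function) pair is applied with `pattern.sub(function, text)`. The sample string and the `apply_rules` helper are made up for illustration. Note that this sketch returns "'" (a single quote), whereas the lambda in the command above returns '"' (a double quote).

```python
import re

# Hypothetical sample text; the real feed content is not shown in this post.
sample = "He said and #8216;helloand #8217; to me. AND #8216;again"

# One pattern covering both the opening (8216) and closing (8217) codes.
# [67] matches exactly one of the two final digits; the semicolon is required.
pattern = re.compile(r'and #821[67];', re.IGNORECASE)

# Same shape as the list in the post: (compiled regex, replacement function).
preprocess_regexps = [(pattern, lambda match: "'")]


def apply_rules(text, rules):
    """Hypothetical helper: run each (regex, function) pair over the text in order."""
    for regex, func in rules:
        text = regex.sub(func, text)
    return text


cleaned = apply_rules(sample, preprocess_regexps)
print(cleaned)
```

If the sample comes back with plain single quotes here but the recipe output is unchanged, the pattern is probably not the problem, and the literal text in the downloaded feed may differ from what is assumed above.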


Also, I am trying to convert '<STRONG>' to '<b>', but it doesn't seem to work. The command I am using is:
preprocess_regexps = [(re.compile(r'<strong.?>', re.DOTALL|re.IGNORECASE), lambda match: '<b>')]


(also doing a similar command for the end tag.) What am I doing wrong?
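One thing that may be worth checking: in Python, a second `preprocess_regexps = [...]` assignment rebinds the name, so only the last list assigned would survive if several such lines sit in the same recipe. Below is a minimal sketch, under that assumption, that keeps the tag rules together in a single list and tests them with plain `re`. The sample HTML is invented for illustration, and the patterns are one possible way to write the match, not necessarily the recommended one.

```python
import re

# Hypothetical sample markup; the real article HTML is not shown in this post.
sample = '<STRONG>Title</STRONG> and <strong class="x">more</strong>'

# All rules in ONE list, so a later assignment does not discard earlier ones.
preprocess_regexps = [
    # Opening tag, with or without attributes: <strong>, <STRONG class="x">
    (re.compile(r'<strong\b[^>]*>', re.IGNORECASE), lambda match: '<b>'),
    # Closing tag: </strong>, </STRONG>
    (re.compile(r'</strong\s*>', re.IGNORECASE), lambda match: '</b>'),
]

# Assumed application order: each pair is run over the text in sequence.
converted = sample
for regex, func in preprocess_regexps:
    converted = regex.sub(func, converted)

print(converted)
```

The quote-replacement pair from earlier in this post could be added to the same list as a third entry.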

Last edited by olaf; 09-26-2009 at 11:17 AM.