05-18-2013, 06:11 AM   #11
hegi
Device: Kindle 4 & Kindle PW 3G
Hey Folks,

I seem to be getting nowhere with my limited attempts at preprocess_html. The results are strange, and I'm having difficulty getting to grips with the Beautiful Soup documentation.

Nevertheless, couldn't I do the trick more easily with preprocess_regexps?

My current status is as follows:

Code:
preprocess_regexps    = [(re.compile(r'(<span class="hcf-location-mark">.+) (</span>)', re.DOTALL|re.IGNORECASE), lambda match: "\1'. '\2")]
But I don't see any change in the output. Could it be that grouping parts of the regular expression in parentheses and referencing them with \1 or \2 does not work in this case?

I found some useful examples for preprocess_regexps here; however, I haven't found a documented way to include the match from the search in the replacement.
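
For what it's worth, here is a minimal sketch of one way the matched groups might be carried into the replacement. It assumes each (pattern, callable) pair ends up being applied the way re.sub applies a function replacement, in which case \1 and \2 inside the returned string are not expanded (and in a non-raw string "\1" is just the control character chr(1)); the groups have to be read from the match object instead. The sample HTML and the name add_period are made up for illustration, and the pattern uses a non-greedy .+? in place of the original .+ so that, with re.DOTALL, it stops at the first closing tag.

```python
import re

# Hypothetical sample input; the real page markup may differ.
html = '<p><span class="hcf-location-mark">Berlin </span>Some text.</p>'

# Same idea as the pattern above, but with a non-greedy ".+?" (a swapped-in
# choice) so the match ends at the first " </span>" rather than the last one.
pattern = re.compile(
    r'(<span class="hcf-location-mark">.+?) (</span>)',
    re.DOTALL | re.IGNORECASE,
)


def add_period(match):
    # A function replacement receives the match object; its return value is
    # used literally, so the groups are fetched with match.group().
    return match.group(1) + '. ' + match.group(2)


preprocess_regexps = [(pattern, add_period)]

# Stand-in for the recipe machinery (an assumption): apply each rule in turn.
result = html
for regexp, replacement in preprocess_regexps:
    result = regexp.sub(replacement, result)

print(result)
```

If this assumption holds, the same thing can be written inline as lambda match: match.group(1) + '. ' + match.group(2).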

Many thanks in advance for any useful hints in this matter.

Hegi.