03-17-2015, 10:00 PM   #14
DiapDealer
The r''' ''' (raw, triple-quoted) string is probably overkill in this situation, but I've gotten into the habit of using it for all my regular expressions in Python. '[-–—]' or "[-–—]" would achieve the same thing in this particular instance, since the pattern contains no backslashes or quote characters. Either way, it's still just a string representation of the regular expression.
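To make the raw-string point concrete, here is a small sketch (assuming Python 3 and a UTF-8 source file; the variable names are made up for illustration). It shows that the three spellings are the same string when no backslashes are involved, and where the raw form starts to matter.

Code:
```python
# Without backslashes, all three literals are the very same string.
plain_single = '[-–—]'
plain_double = "[-–—]"
raw_triple = r'''[-–—]'''
all_equal = (plain_single == plain_double == raw_triple)

# With backslashes, a regular literal needs doubling; a raw literal does not.
digits_plain = '\\d+'
digits_raw = r'\d+'
escaping_equal = (digits_plain == digits_raw)

# Triple quotes let a pattern contain both quote characters unescaped.
either_quote = r'''["']'''

print(all_equal, escaping_equal, either_quote)
```
So the habit costs nothing here, and it saves escaping headaches once a pattern picks up backslashes or quotes.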

Code:
text_str = regex.sub(r'''[-–—]''', ' ', match.group(3))
# Read as: regex.sub(pattern, replacement, string), i.e. substitute everything matching 'pattern' with 'replacement' in 'string'
Find all occurrences of - or – or — and replace them with a space in the string contained in the 3rd matching group. Store the results in text_str.

Code:
text_str = regex.sub(r''' {2,}''', ' ', text_str)
Find all occurrences of two or more consecutive spaces and replace them with a single space in the text_str string. Store the results in text_str.
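The two substitutions above can be tried on their own. This is only a sketch: it assumes the standard library re module (whose sub() takes the same pattern, replacement, string order), and clean_dashes and the sample text are names I made up for the demo.

Code:
```python
import re

def clean_dashes(text):
    """Replace hyphens, en dashes and em dashes with spaces, then collapse runs of spaces."""
    # Step 1: every -, – or — becomes a single space.
    text = re.sub(r'''[-–—]''', ' ', text)
    # Step 2: two or more consecutive spaces become one space.
    text = re.sub(r''' {2,}''', ' ', text)
    return text

sample = 'Chapter 1 — The Beginning - Part–Two'
cleaned = clean_dashes(sample)
print(cleaned)  # Chapter 1 The Beginning Part Two
```
The second step is needed because a dash that already had spaces around it leaves three spaces behind after the first step.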

Code:
return '<{0}{1}>{2}</{0}>'.format(match.group(1), match.group(2), text_str)
String formatting/substitution.
Code:
'Hello {0}'.format('there')
Substitute {0} with 'there'
Code:
'Hello {0} {1} {2}, {0}'.format('there', 'you', 10)
Becomes 'Hello there you 10, there'

You don't even need to use numbers if you're not going to repeat anything:
Code:
'Hello {} {} {}, {}'.format('there', 'you', 10, 'you')
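For anyone who wants to run the format() examples above, here is a quick sketch (variable names are just for illustration). It also shows one catch: numbered and un-numbered fields can't be mixed in the same string.

Code:
```python
# Numbered fields: an index can be reused as often as needed.
single = 'Hello {0}'.format('there')
repeated = 'Hello {0} {1} {2}, {0}'.format('there', 'you', 10)

# Un-numbered fields: filled in order, so a repeat needs its own argument.
auto = 'Hello {} {} {}, {}'.format('there', 'you', 10, 'you')

# Mixing the two styles in one string raises ValueError.
try:
    '{} {0}'.format('a', 'b')
    mixing_allowed = True
except ValueError:
    mixing_allowed = False

print(single)
print(repeated)
print(auto)
print(mixing_allowed)
```
Note that the integer 10 is converted to text automatically; no str() call is needed.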
You could also use string concatenation:
Code:
return '<' + match.group(1) + match.group(2) + '>' + text_str + '</' + match.group(1) + '>'
But then you have to make sure everything is already a string beforehand. That's probably not a concern in this case, but using format() is a habit I've gotten into to avoid type mismatches (plus I just like it better than the %s/%d string substitution method).

Code:
return '<%s%s>%s</%s>' % (match.group(1), match.group(2), text_str, match.group(1))
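The type-mismatch worry mentioned above is easiest to see with a non-string value. A minimal sketch, with a made-up count variable standing in for whatever value might not be a string:

Code:
```python
count = 10  # an int, not a string

# Concatenation refuses to mix str and int.
try:
    concatenated = 'Total: ' + count
except TypeError:
    concatenated = None

# Concatenation works once the value is converted explicitly.
explicit = 'Total: ' + str(count)

# format() and the % operator both handle the conversion themselves.
formatted = 'Total: {0}'.format(count)
percent = 'Total: %d' % count

print(concatenated, explicit, formatted, percent)
```
With regex match groups everything is already a string, which is why concatenation would also work in the case being discussed.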
In this particular case:
Code:
return '<{0}{1}>{2}</{0}>'.format(match.group(1), match.group(2), text_str)
match.group(1) will be the tag name (h1, h2, h3, etc) and gets plugged into both {0}s.
match.group(2) will be any (optional) attributes (class="foo") and gets plugged into {1}.
text_str is our manipulated content from between the heading tags and gets plugged into {2}.
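Putting the pieces together, here is a sketch of how the three lines might sit inside a replacement function. The heading pattern, the fix_heading name and the sample HTML are my own assumptions for the demo (the actual search pattern isn't shown in this post), and it uses the standard library re module.

Code:
```python
import re

# Hypothetical pattern: group 1 = tag name, group 2 = attributes, group 3 = content.
HEADING = re.compile(r'''<(h[1-6])([^>]*)>(.*?)</\1>''', re.DOTALL)

def fix_heading(match):
    """Called once per matched heading; returns the rebuilt heading."""
    # Dashes to spaces, then collapse repeated spaces.
    text_str = re.sub(r'''[-–—]''', ' ', match.group(3))
    text_str = re.sub(r''' {2,}''', ' ', text_str)
    # {0} is used twice: once for the opening tag, once for the closing tag.
    return '<{0}{1}>{2}</{0}>'.format(match.group(1), match.group(2), text_str)

html = '<h2 class="foo">Part One — The Start</h2><p>a - b</p>'
result = HEADING.sub(fix_heading, html)
print(result)  # <h2 class="foo">Part One The Start</h2><p>a - b</p>
```
Only the heading content changes; the dash inside the paragraph is left alone because it falls outside the pattern.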

Last edited by DiapDealer; 03-17-2015 at 10:11 PM.