07-07-2013, 07:18 PM   #1
Rackamouth
Junior Member
Posts: 6
Karma: 10
Join Date: Jun 2013
Device: Kindle Touch
postprocess_html receives html string instead of soup

Hi,
I'm developing a recipe without feeds, so I use parse_index. Everything works fine, except that extra_css gets lost somewhere along the way, so the end product has no styles. The original html assigns styles by id, so instead of supplying my own styles with extra_css I decided to replace <p id=headline>...</p> with <h1>...</h1>, <p id=quote>...</p> with <blockquote>...</blockquote>, and so on. This is what I do:
Code:
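def postprocess_html(self, soup, first):
    # Rename the id-styled paragraphs to semantic tags
    for tag in soup.findAll(id='headline'):
        tag.name = 'h1'
    for tag in soup.findAll(id='quote'):
        tag.name = 'blockquote'
    # calibre expects the processed soup to be returned
    return soup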
But soup.findAll fails because, for a reason I can't work out, soup is a plain string rather than a BeautifulSoup object. Am I missing something?
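
Until I find the cause, a guard along these lines might serve as a stopgap: it parses the markup first if it arrives as a string, then does the renaming. This is only a sketch. `rename_by_id` and `ID_TO_TAG` are names I made up, and the parser is passed in as a callable (an assumption) so the helper doesn't depend on which BeautifulSoup import the recipe environment provides.

```python
# Hypothetical mapping: id used in the source html -> tag name wanted instead.
ID_TO_TAG = {'headline': 'h1', 'quote': 'blockquote'}


def rename_by_id(soup, parse):
    """Rename tags by id, parsing first if `soup` arrived as a string.

    `parse` is assumed to be any callable that turns markup into an object
    with a BeautifulSoup-style findAll(id=...) method.
    """
    if isinstance(soup, (str, bytes)):
        # Not a parsed tree yet, so build one before touching any tags.
        soup = parse(soup)
    for tag_id, new_name in ID_TO_TAG.items():
        for tag in soup.findAll(id=tag_id):
            tag.name = new_name
    return soup
```

With that in place, postprocess_html would shrink to something like `return rename_by_id(soup, BeautifulSoup)`, assuming the recipe imports a BeautifulSoup class. It doesn't explain why a string shows up in the first place, it only works around it.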

Thanks,
TM