Old 09-02-2010, 01:55 PM   #2594
kiklop74
Guru
Quote:
Originally Posted by TonytheBookworm View Post
I've been looking at the AdventureGamer code and I have a few questions.
Quote:
Originally Posted by TonytheBookworm View Post
Code:
def preprocess_html(self, soup):
    mtag = '<meta http-equiv="Content-Language" content="en-US"/>\n<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>'
    soup.head.insert(0, mtag)
What is the reason for inserting the meta tag?
That was an early experiment of mine with soup. It is no longer needed, and I do not put it in new recipes. You can simply ignore it.

Quote:
Originally Posted by TonytheBookworm View Post
Code:
       for item in soup.findAll(style=True):
           del item['style']
Why is the above used? It appears to remove every instance of style, but why is that needed?
It removes every inline style attribute, which usually sets text properties such as font, size or color. We want the text as raw as possible, without any styling whatsoever.
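
To make the effect of that loop concrete, here is a minimal sketch that runs outside calibre. It is not the recipe code: it uses the standard library's ElementTree as a stand-in for soup, and the function name strip_inline_styles and the sample markup are made up for illustration.

```python
import xml.etree.ElementTree as ET


def strip_inline_styles(xhtml):
    """Return the markup with every style attribute removed.

    Hypothetical helper: ElementTree stands in for soup here, so the
    input must be well-formed XHTML.
    """
    root = ET.fromstring(xhtml)
    for element in root.iter():
        # pop() with a default does nothing when there is no style attribute
        element.attrib.pop('style', None)
    return ET.tostring(root, encoding='unicode')


# Made-up sample markup with two inline styles
sample = ('<div><p style="color:red">Hello</p>'
          '<span style="font-size:9px">world</span></div>')
cleaned = strip_inline_styles(sample)
print(cleaned)
```

The tags and text survive, only the styling hints are dropped, which is the same result the findAll(style=True) loop is after.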


Quote:
Code:
       self.append_page(soup, soup.body, 3)
I'm not really clear on this. It appears to me that you are taking the whole soup and appending to the body of the soup at position 3?

Code:
       pager = soup.find('div',attrs={'class':'toolbar_fat'})
       if pager:
          pager.extract()
I looked through the code and didn't see why this extraction is needed, because the navigation appears to be inside toolbar_fat_next.
A full answer would require a somewhat longer explanation, but in short: I am merging multipage articles into a single one. The second snippet removes the div with class toolbar_fat; I take it out because the navigation is no longer needed once everything is joined into one continuous article.
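
Roughly, the idea looks like the sketch below. This is not the actual append_page from the recipe: it uses the standard library's ElementTree instead of soup, the PAGES dictionary is a made-up stand-in for fetching pages from the site, and the markup (a link inside div.toolbar_fat_next inside div.toolbar_fat) is only my assumption about the layout, so treat all names here as hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up stand-in for the site: URL -> page body. A real recipe would fetch these.
PAGES = {
    'page1': ('<body><p>Part one.</p><div class="toolbar_fat">'
              '<div class="toolbar_fat_next"><a href="page2">Next</a></div>'
              '</div></body>'),
    'page2': '<body><p>Part two.</p><div class="toolbar_fat"></div></body>',
}


def remove_pager(body):
    """Detach every div.toolbar_fat and return the next-page URL, if any."""
    next_url = None
    for parent in list(body.iter()):
        for child in list(parent):
            if child.tag == 'div' and child.get('class') == 'toolbar_fat':
                # Remember where the "next" link points before dropping the pager
                link = child.find(".//div[@class='toolbar_fat_next']/a")
                if link is not None:
                    next_url = link.get('href')
                parent.remove(child)
    return next_url


def merge_pages(first_url, pages):
    """Follow the next links and collect all page content into the first body."""
    body = ET.fromstring(pages[first_url])
    next_url = remove_pager(body)
    seen = {first_url}
    while next_url and next_url not in seen:  # guard against link loops
        seen.add(next_url)
        extra = ET.fromstring(pages[next_url])
        next_url = remove_pager(extra)
        body.extend(list(extra))  # move the remaining content into the first page
    return body


merged = ET.tostring(merge_pages('page1', PAGES), encoding='unicode')
print(merged)
```

Each page contributes its content, the pager is read once to find the next page and then thrown away, so the final article has no navigation left in it.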