Quote:
Originally Posted by marbs
I read a few examples and I think I can write the function itself.
I am not sure I know how to use it, or how to call it.
But now I am lost. I don't know where I am going with this. Can someone focus me again?
I don't have much time, so I haven't looked at your function, but normally, it's used this way:
Code:
def preprocess_html(self, soup):
    self.append_page(soup, soup.body, 3)
    return soup
This takes the article page before it's processed (at the preprocess_html stage) and uses append_page to stick the extra content into the body of the soup. The result is a "modified page": the first article page, plus the content of all the subsequent pages reached by following the next page button, tacked onto the bottom of the first page. You will note that append_page is recursive: it calls itself on each following page and stops when there is no next page button left.
The result is that the recipe sees a single-page article containing the content of all the pages before it begins to process that article.
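If it helps to see the recursion on its own, here is a toy sketch in plain Python. It is not calibre code and not your function: PAGES, fetch() and the dict layout are all made up for illustration, with fetch() standing in for whatever your recipe uses to download the next page, and lists of strings standing in for soup tags. It also inserts at the end of the body rather than at a fixed position like 3.

```python
# Toy model only: PAGES stands in for the website (hypothetical data).
PAGES = {
    "page1": {"body": ["Intro text"], "next": "page2"},
    "page2": {"body": ["Middle text"], "next": "page3"},
    "page3": {"body": ["Final text"], "next": None},
}


def fetch(url):
    # Hypothetical stand-in for downloading and parsing a page.
    # Returns a fresh copy so that appending never alters PAGES itself.
    page = PAGES[url]
    return {"body": list(page["body"]), "next": page["next"]}


def append_page(page, appendtag, position):
    next_url = page["next"]
    if next_url is None:
        return  # base case: no "next page" button, nothing to add
    next_page = fetch(next_url)
    text = next_page["body"]
    # Recurse first: afterwards `text` holds the next page plus all later pages.
    append_page(next_page, text, len(text))
    # Splice the collected content into the caller's body at `position`.
    appendtag[position:position] = text


def preprocess_html(page):
    # Same shape as the recipe method: fill in the body, then return the page.
    append_page(page, page["body"], len(page["body"]))
    return page


article = preprocess_html(fetch("page1"))
print(article["body"])  # ['Intro text', 'Middle text', 'Final text']
```

The point to take away is the order of the calls: each level pulls in everything after it before inserting into the page that called it, so by the time preprocess_html returns, the first page already contains all the others.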
Does that help?