Quote:
Originally Posted by kovidgoyal
You can check that by passing raw=True to index_to_soup; it will then return the raw HTML it got from the server. Save that to a file and examine it.
Hmm, I checked this:
Code:
raw = self.index_to_soup(self.FRONTPAGE, raw=True)  # raw HTML from the server, not a parsed soup
and I don't get the full page, which is consistent with what I was seeing. However, when I turn off JavaScript in Firefox and load the same URL, I still get the full page in the browser.
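For anyone who wants to repeat the check, here is a minimal sketch of saving the raw output to a file so it can be compared with what the browser shows. The helper name save_raw_html and the file name frontpage_raw.html are made up for illustration; only index_to_soup and raw=True come from the suggestion above. I'm assuming the raw result is bytes, but the helper also accepts a str just in case.

```python
import os
import tempfile


def save_raw_html(raw, path):
    """Write the raw HTML to `path` and return the number of bytes written.

    `raw` is assumed to be what index_to_soup(..., raw=True) returned.
    """
    # Normalize to bytes so the file holds what the server sent, as closely as possible.
    if isinstance(raw, str):
        raw = raw.encode("utf-8")
    with open(path, "wb") as f:
        f.write(raw)
    return len(raw)


# Inside the recipe (only works when run by calibre):
#     raw = self.index_to_soup(self.FRONTPAGE, raw=True)
#     out = os.path.join(tempfile.gettempdir(), "frontpage_raw.html")
#     self.log("saved", save_raw_html(raw, out), "bytes to", out)
```

Opening the saved file next to the browser's "view source" output makes it easier to see exactly where the two differ.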
What else can I do to try to download the entire page?