Old 11-28-2014, 12:44 AM   #5
ireadtheinternet
Thanks as always, Kovid! This helped.

It worked once I changed the first lines of parse_index to strip the <script> blocks from the raw page before parsing it:
Code:
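    # Needs "import re" at the top of the recipe file.
    def parse_index(self):
        # Fetch the raw HTML (not a parsed soup) so it can be cleaned up first
        toc_page_raw = self.index_to_soup('http://www.imdb.com/search/title?sort=year,desc&production_status=released&title_type=feature', raw=True)
        # Remove every <script>...</script> block; DOTALL lets '.' span newlines
        toc_page_raw = re.sub(r'<script\b.+?</script>', '', toc_page_raw, flags=re.DOTALL|re.IGNORECASE)
        # Parse the cleaned HTML and locate the main content div
        toc_page = self.index_to_soup(toc_page_raw)
        toc = toc_page.find(name='div', attrs={'id':'main'})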
        ...
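For anyone who wants to try the script-stripping step outside of calibre, here is a minimal standalone sketch. It uses the same regex as above on a made-up page; SAMPLE_HTML and the strip_scripts helper are hypothetical names for illustration, not part of calibre or of the real IMDb markup. It also assumes the page is already text; on newer Python 3 based calibre builds the raw download may come back as bytes, in which case it would probably need decoding before the substitution.

```python
import re

# Hypothetical sample page; the real IMDb markup is not shown in this post.
SAMPLE_HTML = '''<html><body>
<div id="main"><p>Feature list</p>
<SCRIPT type="text/javascript">
document.write("<div id='main'>decoy</div>");
</SCRIPT>
</div>
</body></html>'''

# Same pattern as in parse_index: lazily match each <script ...>...</script>
# pair. DOTALL lets '.' span newlines, IGNORECASE also catches <SCRIPT>.
SCRIPT_RE = re.compile(r'<script\b.+?</script>', flags=re.DOTALL | re.IGNORECASE)


def strip_scripts(html):
    """Return html with every <script> element removed (illustrative helper)."""
    return SCRIPT_RE.sub('', html)


cleaned = strip_scripts(SAMPLE_HTML)
print(cleaned)
```

The cleaned string can then be handed to the parser, so that markup written inside script bodies no longer confuses it.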
Now to merge this with my original recipe.