MobileRead Forums - View Single Post

ireadtheinternet · 10-31-2014, 05:37 PM

Here is what I ended up with so far, just to give an idea.

Code:

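    def preprocess_html(self, soup):
        # Swap IMDb's truncated plot summary for the full one, when a link to it exists.
        IMDB_BASE = 'http://www.imdb.com'

        truncated_summary = soup.find('p', attrs={'itemprop': ['description']})
        if truncated_summary is None:
            # No summary paragraph on this page, so leave the page untouched.
            return soup

        link_to_full_summary = truncated_summary.find('a')
        if link_to_full_summary is not None:
            # Fetch the linked page and pull the full plot summary out of it.
            full_summary_soup = self.index_to_soup(IMDB_BASE + link_to_full_summary['href'])
            full_plot_summary = full_summary_soup.find('p', attrs={'class': ['plotSummary']})
            if full_plot_summary is not None:
                truncated_summary.replaceWith(full_plot_summary)

        return soup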

I will post the whole recipe when done.
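
A side note on the snippet above: building the URL as IMDB_BASE + href assumes the link's href is always site-relative and starts with a slash. Here is a minimal sketch of a more forgiving alternative using the standard library's urljoin (Python 3 import path shown). The helper name absolute_imdb_url and the sample paths are made up for illustration and are not part of the recipe.

```python
from urllib.parse import urljoin

IMDB_BASE = 'http://www.imdb.com'


def absolute_imdb_url(href):
    """Resolve an href found on an IMDb page against the site root.

    Hypothetical helper, not part of the original recipe.
    """
    # urljoin leaves absolute URLs untouched and resolves relative ones
    # against IMDB_BASE, so both kinds of href are handled.
    return urljoin(IMDB_BASE, href)


# Example hrefs are illustrative only.
print(absolute_imdb_url('/title/tt0000001/plotsummary'))
print(absolute_imdb_url('http://www.imdb.com/title/tt0000001/plotsummary'))
```

Inside preprocess_html this would stand in for the IMDB_BASE + link_to_full_summary['href'] expression.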

EDITED 11/4: Today I learned that the full summary also appears on the same page, below the truncated summary and the credits, so I took this bit of code out of the recipe since it is not needed.

Last edited by ireadtheinternet; 11-04-2014 at 01:03 AM.