Here is what I ended up with so far, just to give an idea.
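Code:
def preprocess_html(self, soup):
    IMDB_BASE = 'http://www.imdb.com'
    # The title page only shows a truncated plot summary.
    truncated_summary = soup.find('p', attrs={'itemprop': ['description']})
    if truncated_summary is None:
        return soup
    # A truncated summary contains a link to the full plot summary page.
    link_to_full_summary = truncated_summary.find('a')
    if link_to_full_summary is not None:
        full_summary_soup = self.index_to_soup(IMDB_BASE + link_to_full_summary['href'])
        full_plot_summary = full_summary_soup.find('p', attrs={'class': ['plotSummary']})
        # Only swap the paragraph if the full summary was actually found.
        if full_plot_summary is not None:
            truncated_summary.replaceWith(full_plot_summary)
    return soup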
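A side note on the URL building in the code above: plain string concatenation works as long as the href is root-relative (starts with '/'). Here is a minimal sketch of a more forgiving variant, assuming the link could also turn out to be an absolute URL. It uses urljoin from the Python 3 standard library (in Python 2 the same function lives in the urlparse module). The helper name full_summary_url and the sample hrefs are made up for illustration and are not taken from a real page.

```python
from urllib.parse import urljoin

IMDB_BASE = 'http://www.imdb.com'

def full_summary_url(href):
    # Hypothetical helper: resolve the link against the site root.
    # urljoin returns absolute URLs unchanged and resolves relative ones.
    return urljoin(IMDB_BASE, href)

# Sample hrefs are invented for illustration only.
print(full_summary_url('/title/tt0000000/plotsummary'))
print(full_summary_url('http://www.imdb.com/title/tt0000000/plotsummary'))
```

In the recipe, the result could be passed to self.index_to_soup() in place of the concatenated string.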
I will post the whole recipe when done.
EDITED: 11/4 - Today I learned that the full summary is also on the same page, below the truncated summary and the credits, so I took this bit of code out of the recipe since it is not needed.