I will need a bit more help, it seems.
This is a clean Python example:
Code:
>>> import urllib2  # Python 2
>>> from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3
>>> u = 'http://www.spiegel.de/international/europe/0,1518,773071,00.html'
>>> v = urllib2.urlopen(u).read()
>>> soup = BeautifulSoup(v) # this should be identical to Calibre's Soup
>>> img = soup.find('div', {'class' : 'spGalleryBigPic'}).find('img')
>>> type(img)
<class 'BeautifulSoup.Tag'>
So basically, I want to insert this image into the article (eventually below the heading, but for now I just want to get it into the EPUB article at all).
So I wrote the following preprocess function:
Code:
def preprocess_html(self, soup):
    soup = soup.find('div', {'class' : 'spGalleryBigPic'}).find('img')
    return soup
which returns what I called img above.
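A variant I could try instead, as a minimal sketch only: as far as I understand the calibre docs, preprocess_html should return the soup after processing it, so this version moves the image to the top of the body and returns the whole soup rather than the bare tag. The names keep_gallery_image and SpiegelSketch are made up for illustration, and the bs4 import is an assumption so the snippet runs outside calibre (inside a recipe the soup is passed in already). It also guards against the div being missing, since I have not checked whether the page calibre actually fetches contains it.

```python
# Assumption: standalone bs4 for testing; a calibre recipe receives the soup directly.
from bs4 import BeautifulSoup


def keep_gallery_image(soup):
    """Hypothetical helper: move the gallery image to the top of <body>.

    Always returns the whole soup, never the bare <img> tag.
    """
    gallery = soup.find('div', {'class': 'spGalleryBigPic'})
    img = gallery.find('img') if gallery is not None else None
    if img is not None and soup.body is not None:
        # extract() detaches the tag from the gallery div before re-inserting it
        soup.body.insert(0, img.extract())
    return soup


class SpiegelSketch(object):  # hypothetical stand-in for the recipe class
    def preprocess_html(self, soup):
        return keep_gallery_image(soup)
```

If the div is not found, the soup comes back unchanged instead of raising an AttributeError on None.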
I get the article using:
Code:
def print_version(self, url):
    'from Spigelde.receipt'
    rmt = url.rpartition('#')[0]
    main, sep, rest = rmt.rpartition(',')
    rmain, rsep, rrest = main.rpartition(',')
    purl = rmain + ',druck-' + rrest + ',' + rest
    return purl
But currently it only works when I do not use preprocess_html.
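To rule out the URL rewriting as the culprit, here is the same logic as a standalone function that can be checked outside calibre. This is a sketch, not the recipe's code: to_print_url is a made-up name, the #ref=rss fragment is an assumed example of what a feed link might look like, and I swapped rpartition('#') for partition('#'), because rpartition('#')[0] is an empty string when the URL has no '#' at all.

```python
def to_print_url(url):
    """Hypothetical standalone version of print_version."""
    # partition() keeps the whole URL when there is no '#' fragment
    base = url.partition('#')[0]
    # Split off the last two comma-separated parts of the article path
    main, _, rest = base.rpartition(',')
    rmain, _, rrest = main.rpartition(',')
    # Prefix the article id with 'druck-' to address the print page
    return rmain + ',druck-' + rrest + ',' + rest


# Article URL from above, plus an assumed feed-style fragment
feed_url = ('http://www.spiegel.de/international/europe/'
            '0,1518,773071,00.html#ref=rss')
print(to_print_url(feed_url))
```

With or without the fragment, the article id 773071 ends up prefixed with druck-, so the rewriting itself looks fine to me.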