MobileRead Forums - View Single Post - Grabbing and including image from another url

Rasmus · 07-08-2011, 10:19 AM

I will need a bit more help, it seems.

Here is a minimal example in plain Python (outside Calibre):

Code:

>>> import urllib2
>>> from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3, Python 2
>>> u = 'http://www.spiegel.de/international/europe/0,1518,773071,00.html'
>>> v = urllib2.urlopen(u).read()
>>> soup = BeautifulSoup(v)  # this should be identical to Calibre's Soup
>>> img = soup.find('div', {'class': 'spGalleryBigPic'}).find('img')
>>> type(img)
<class 'BeautifulSoup.Tag'>

Basically, I want to insert this image into the article. Eventually it should go below the heading, but for now I just want to get it into the EPUB article at all.

So I wrote the following preprocess_html function:

Code:

    def preprocess_html(self, soup):
        soup = soup.find('div', {'class' : 'spGalleryBigPic'}).find('img')
        return soup

which returns what I called img above.

I get the article using:

Code:

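    def print_version(self, url):
        'from Spiegelde.recipe'
        # drop the '#...' fragment (the URL must contain a '#')
        rmt = url.rpartition('#')[0]
        # split off the last two comma-separated parts of the URL
        main, sep, rest = rmt.rpartition(',')
        rmain, rsep, rrest = main.rpartition(',')
        # prefix the second-to-last part with 'druck-' to get the print URL
        purl = rmain + ',druck-' + rrest + ',' + rest
        return purl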
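
To make the URL rewrite concrete, here is a standalone sketch of the same string manipulation outside the recipe class. The name `to_print_url` is hypothetical, and the `#ref=rss` fragment on the sample URL is an assumption about what the feed delivers. Since `rpartition('#')[0]` is empty when the URL has no `#`, this variant uses `partition` instead, which keeps the whole URL in that case.

```python
def to_print_url(url):
    """Rewrite a Spiegel article URL into its 'druck-' (print) form."""
    # partition() keeps the whole URL when there is no '#', unlike rpartition()
    base = url.partition('#')[0]
    # peel off the last two comma-separated parts
    main, _, rest = base.rpartition(',')
    rmain, _, rrest = main.rpartition(',')
    # prefix the second-to-last part with 'druck-'
    return rmain + ',druck-' + rrest + ',' + rest


# sample URL; the '#ref=rss' fragment is an assumption for illustration
feed_url = 'http://www.spiegel.de/international/europe/0,1518,773071,00.html#ref=rss'
print(to_print_url(feed_url))
# http://www.spiegel.de/international/europe/0,1518,druck-773071,00.html
```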

However, the recipe currently only works when I do not use preprocess_html.
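
One possible cause, sketched under assumptions: Calibre expects preprocess_html to return the complete soup, whereas the function above returns only the img tag, and `find` returns None when the page being processed has no `spGalleryBigPic` div. The sketch below uses bs4 as a stand-in for Calibre's bundled BeautifulSoup. The helper name `insert_gallery_image`, the `<h1>` heading, and the sample markup are all hypothetical. Fetching the gallery page inside the recipe is left out.

```python
from bs4 import BeautifulSoup  # assumption: bs4 stands in for Calibre's bundled soup


def insert_gallery_image(article_soup, gallery_soup):
    """Move the gallery <img> into the article and return the whole article soup."""
    div = gallery_soup.find('div', {'class': 'spGalleryBigPic'})
    img = div.find('img') if div is not None else None
    if img is None:
        return article_soup  # nothing to insert; leave the article unchanged
    heading = article_soup.find('h1')  # assumption: the heading is an <h1>
    if heading is not None:
        heading.insert_after(img)  # place the image right below the heading
    elif article_soup.body is not None:
        article_soup.body.insert(0, img)  # otherwise put it at the top of the body
    return article_soup


# hypothetical markup standing in for the print version and the original page
article = BeautifulSoup(
    '<html><body><h1>Title</h1><p>Text</p></body></html>', 'html.parser')
gallery = BeautifulSoup(
    '<div class="spGalleryBigPic"><img src="a.jpg"/></div>', 'html.parser')
result = insert_gallery_image(article, gallery)
print(result)
```

The point of the sketch is the return value: the function hands back the full article with the image added, not the image tag on its own.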