MobileRead Forums - View Single Post - Grabbing and including image from another url

Rasmus · 07-07-2011, 10:12 PM

Hi,
I am trying to improve the Spiegel Int'l recipe.

So far I have got sections working and have removed a lot of noise by using the print version of the articles.

However, the print versions do not include images. I like the images and would prefer to include them.

I have written the following simple function, which grabs an image from the non-print version of an article:

Code:

    def get_img(self, url):
        # Parse the regular (non-print) article page and return the gallery image URL
        soup = BeautifulSoup(urllib2.urlopen(url))
        img = soup.find('div', {'class' : 'spGalleryBigPic'}).find('img')['src']
        return img

I assume that, by some magic, ebook-convert already knows the article url (it does in the print version function).

My questions are

How do I include this picture in the actual ebook?
Am I 'allowed' to use urllib2 in recipes, or is some other method preferred? (Most likely it is.)
Should I add some kind of robustness to the img lookup, or will Calibre handle it? (Probably not; I guess I could just wrap the img lookup in a try block to avoid exceptions.)
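On the robustness question, here is a minimal sketch of one way to make the lookup safe, assuming the image always sits in a div with class spGalleryBigPic as in the code above. It is written in plain Python 3 with only the standard library (not BeautifulSoup, and not any Calibre API) so that it runs standalone; the names GalleryImageFinder and find_gallery_img are made up for illustration. Rather than raising when the div or the img is missing, it returns None.

```python
from html.parser import HTMLParser


class GalleryImageFinder(HTMLParser):
    """Record the src of the first <img> inside <div class="spGalleryBigPic">."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # div nesting depth inside the target div (0 = outside it)
        self.src = None   # stays None when no matching image is found

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == 'div':
            classes = (attrs.get('class') or '').split()
            # Enter the target div, or track a div nested inside it
            if self.depth or 'spGalleryBigPic' in classes:
                self.depth += 1
        elif tag == 'img' and self.depth and self.src is None:
            self.src = attrs.get('src')

    def handle_endtag(self, tag):
        if tag == 'div' and self.depth:
            self.depth -= 1


def find_gallery_img(html):
    """Return the gallery image URL, or None if the page has no such image."""
    finder = GalleryImageFinder()
    finder.feed(html)
    return finder.src
```

The caller can then test the result against None and simply skip the image for articles that have no gallery picture, instead of catching an exception.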

These questions probably have obvious answers, but I could not find all the documentation I wanted on recipes...

Cheers,
Rasmus

PS: I wrote a recipe for the Economist's Daily Chart. Should I share it, and what is the custom for these things? Should I share my improved Spiegel recipe too?
