Old 09-16-2011, 03:31 AM   #1
fenuks
Enthusiast
fenuks began at the beginning.
 
Posts: 34
Karma: 10
Join Date: Aug 2011
Device: Amazon Kindle 3
How to force calibre to download images?

Hello. I made a recipe for a site (cgm.pl) that uses Flash player to display images. In the preprocess_html function I extract the image ID from the Flash tag and replace that tag with an image tag. The problem is that those images aren't downloaded; that is, they aren't saved into the output file. When I open the file, the calibre e-book viewer downloads all the photos into memory instead. How can I force calibre to download those images into the file? Thanks for your help!

preprocess_html function
Code:
def preprocess_html(self, soup):
    gallery = soup.find('div', attrs={'class': 'galleryFlash'})
    if gallery:
        img = gallery.find('embed')
        if img:
            # Drop the 35-character "_flash/single_photo.swf?photo_file=" prefix,
            # leaving only the image file name
            img = img['src'][35:]
            gallery.replaceWith('<img src="http://www.cgm.pl/_vault/_gallery/_photo/' + img + '" />')

    return soup


flash tag
Code:
<div class="galleryFlash">
    <object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" codebase="http://fpdownload.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=8,0,0,0" width="689" height="442" id="single_photo" align="middle">
        <param name="allowScriptAccess" value="sameDomain">
        <param name="movie" value="_flash/single_photo.swf?photo_file=21578.jpg">
        <param name="quality" value="high">
        <param name="bgcolor" value="#000000">
        <embed src="_flash/single_photo.swf?photo_file=21578.jpg" quality="high" bgcolor="#000000" width="689" height="442" name="single_photo" align="middle" allowscriptaccess="sameDomain" type="application/x-shockwave-flash" pluginspage="http://www.macromedia.com/go/getflashplayer">
    </object>
</div>
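
The hard-coded [35:] slice only works while the Flash path stays exactly the same length. A minimal sketch of a less brittle alternative, assuming the file name always arrives in the photo_file query parameter as in the tag above; the helper name photo_url_from_embed_src is made up for illustration and is not part of calibre:

```python
from urllib.parse import parse_qs, urlsplit

# Base URL taken from the recipe code above.
PHOTO_BASE = 'http://www.cgm.pl/_vault/_gallery/_photo/'


def photo_url_from_embed_src(src):
    """Build the photo URL from an <embed> src such as
    '_flash/single_photo.swf?photo_file=21578.jpg'.

    Returns None when the photo_file parameter is missing.
    """
    query = parse_qs(urlsplit(src).query)
    names = query.get('photo_file')
    if not names:
        return None
    return PHOTO_BASE + names[0]


if __name__ == '__main__':
    print(photo_url_from_embed_src('_flash/single_photo.swf?photo_file=21578.jpg'))
```

Inside preprocess_html the result would be used in place of the sliced string, skipping the tag when the helper returns None.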
Old 09-16-2011, 11:32 AM   #2
kovidgoyal
creator of calibre
Posts: 45,345
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
That should work just fine. The news download system looks for images to download after preprocess has run. Look in the log to see why the images are not downloading. Also rather than using replaceWith just set

img.name = 'img'
img['src'] = 'whatever'
Old 09-17-2011, 04:56 AM   #3
fenuks
Thanks. That works.
Quote:
Originally Posted by kovidgoyal View Post
img.name = 'img'
img['src'] = 'whatever'
I looked at the log, and when I used the replaceWith() function calibre didn't know that it had to fetch those images. The same thing happened when I used insert:
Code:
sometag.insert(pos, '<img src="link" />')
When I parsed this tag into a soup first, it inserted a '\n'. Anyway, the first method works. Best regards.
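
A likely explanation for this behaviour is that a plain string passed to replaceWith() or insert() ends up in the tree as text rather than as a tag, so a later scan for img tags finds nothing to fetch. The sketch below illustrates that idea using Python's standard xml.etree.ElementTree as a stand-in. It is not calibre's bundled BeautifulSoup, the markup is a simplified, well-formed version of the page, and all function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Well-formed stand-in for the page markup (the real page is not valid XML).
PAGE = (
    '<body>'
    '<div class="galleryFlash">'
    '<embed src="_flash/single_photo.swf?photo_file=21578.jpg" />'
    '</div>'
    '</body>'
)
IMG_URL = 'http://www.cgm.pl/_vault/_gallery/_photo/21578.jpg'


def insert_markup_as_string():
    """Replace the <embed> with the *string* '<img ... />'."""
    root = ET.fromstring(PAGE)
    gallery = root.find('div')
    gallery.remove(gallery.find('embed'))
    gallery.text = '<img src="%s" />' % IMG_URL  # stored as text, not as an element
    return root


def rename_tag_in_place():
    """Turn the existing <embed> element into an <img> element."""
    root = ET.fromstring(PAGE)
    embed = root.find('.//embed')
    embed.tag = 'img'
    embed.set('src', IMG_URL)
    return root


def image_sources(root):
    """Collect src attributes the way a fetcher scanning for <img> elements would."""
    return [img.get('src') for img in root.iter('img')]


if __name__ == '__main__':
    print(image_sources(insert_markup_as_string()))
    print(image_sources(rename_tag_in_place()))
```

Only the in-place rename produces an element that a scan for img can see, which matches what the log showed.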
Old 02-06-2012, 05:24 PM   #4
kiavash
Old Linux User
kiavash began at the beginning.
 
Posts: 36
Karma: 12
Join Date: Jan 2012
Device: NST
Thanks for this post.

This is how I achieved the same thing. As you can see, I used the same trick to make the article title stand out more.

Code:
def preprocess_html(self, soup):
    # Finds all the jpg links
    for figure in soup.findAll('a', attrs={'href': lambda x: x and 'jpg' in x}):
        # makes sure that the link points to the absolute web address
        if figure['href'].startswith('/'):
            figure['href'] = self.site + figure['href']

        figure.name = 'img'  # converts the links to img
        figure['src'] = figure['href']  # with the same address as href

    # Makes the title stand out
    title = soup.find('a', attrs={'class': 'commonSectionTitle'})
    if title is not None:  # not every page has a section title
        title.name = 'h1'

    return soup
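
The startswith('/') check covers site-relative links only, and 'jpg' in x also matches links that merely contain "jpg" somewhere in the path. A minimal sketch of a stricter variant using only the standard library; the helper names and the example.com addresses are hypothetical, and site stands in for the recipe's self.site attribute:

```python
from urllib.parse import urljoin, urlsplit


def is_jpeg_link(href):
    """True when the path part of the link ends in .jpg or .jpeg."""
    if not href:
        return False
    path = urlsplit(href).path.lower()
    return path.endswith(('.jpg', '.jpeg'))


def absolute_image_url(site, href):
    """Resolve href against site; absolute links pass through unchanged."""
    return urljoin(site, href)


if __name__ == '__main__':
    site = 'http://www.example.com'  # placeholder for self.site
    print(absolute_image_url(site, '/images/fig1.jpg'))
    print(is_jpeg_link('/jpg-gallery/index.html'))
```

In the recipe, is_jpeg_link could replace the lambda passed to findAll, and absolute_image_url(self.site, figure['href']) could replace the manual concatenation.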
Old 02-09-2012, 05:24 PM   #5
kiavash
... and make it better?

Quote:
Originally Posted by kovidgoyal View Post
That should work just fine. The news download system looks for images to download after preprocess has run. Look in the log to see why the images are not downloading. Also rather than using replaceWith just set

img.name = 'img'
img['src'] = 'whatever'


After fetching all the images using the above code, they all end up inline with the text. I would like to put a line break between each image and the text before and after it. I tried a couple of techniques, including Tag(soup, 'br /') and tag.insert, but all of them ended up eliminating the image altogether from the final file.

I also attached the example epub that shows the behavior I am referring to.
Code:
def preprocess_html(self, soup):
    # Includes all the figures inside the final ebook
    # Finds all the jpg links
    for figure in soup.findAll('a', attrs={'href': lambda x: x and 'jpg' in x}):
        # makes sure that the link points to the absolute web address
        if figure['href'].startswith('/'):
            figure['href'] = self.site + figure['href']

        figure.name = 'img'  # converts the links to img
        figure['src'] = figure['href']  # with the same address as href
        del figure['href']
        del figure['target']
    return soup


Any idea?
Attached Files
File Type: zip mwrf.zip (1.01 MB, 269 views)

Last edited by kiavash; 02-09-2012 at 05:59 PM.
Old 02-09-2012, 10:49 PM   #6
kovidgoyal
img['style'] = 'display:block'
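
Note that assigning img['style'] directly replaces any inline style the tag already had. If that matters, a minimal sketch like the one below could merge display:block into the existing value. It is plain string handling with a hypothetical helper name, not a calibre API, and it assumes simple declarations without semicolons inside values.

```python
def with_block_display(style):
    """Return a style string that keeps existing declarations
    but forces display:block (any earlier display value is dropped)."""
    declarations = []
    for part in (style or '').split(';'):
        part = part.strip()
        if not part:
            continue
        name = part.split(':', 1)[0].strip().lower()
        if name != 'display':
            declarations.append(part)
    declarations.append('display:block')
    return '; '.join(declarations)


if __name__ == '__main__':
    print(with_block_display('border:0; display:inline'))
```

In a recipe this would be used as img['style'] = with_block_display(img.get('style')), assuming the tag object offers a dict-style get().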
Old 02-10-2012, 12:55 PM   #7
kiavash
Perfect! thanks!
All times are GMT -4.