02-10-2012, 01:02 PM   #21
kiavash
Device: NST
Embed images into an ebook

Some sites don't embed figures/images in their articles; instead, the reader has to click an href link to see each image/figure. That isn't possible on many ebook readers. To embed the images in the output ebook, the tag type needs to be changed from <a> to <img>, and the "href" attribute needs to be changed to "src". The following code does the job by finding all the links to jpg files and changing them to <img> tags. The code should be included in preprocess_html.

Code:
    def preprocess_html(self, soup):
        # Includes all the figures inside the final ebook
        # Finds all the jpg links
        for figure in soup.findAll('a', attrs={'href': lambda x: x and 'jpg' in x}):
            # makes sure that the link points to the absolute web address
            if figure['href'].startswith('/'):
                figure['href'] = self.site + figure['href']

            figure.name = 'img'  # converts the link to img
            figure['src'] = figure['href']  # with the same address as href
            figure['style'] = 'display:block'  # puts the image on its own line, like a \n before and after it
            del figure['href']
            del figure['target']
        return soup
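A possible refinement, offered only as a sketch: the `'jpg' in x` test also matches pages that merely have "jpg" somewhere in their address, and the `startswith('/')` check doesn't cover links that are relative to the current page. The helpers below use the standard library's `urllib.parse` (so this assumes Python 3) to handle both cases. The names `SITE`, `is_jpg_link` and `absolute_url` are made up for this illustration and are not part of calibre; `SITE` stands in for `self.site` in the recipe.

```python
from urllib.parse import urljoin, urlparse

# Hypothetical base address; in the recipe this role is played by self.site.
SITE = 'http://www.example.com'


def is_jpg_link(href):
    """Return True if the path part of the link ends in .jpg or .jpeg."""
    if not href:
        return False
    # Look only at the path, so query strings such as ?size=large are ignored.
    path = urlparse(href).path.lower()
    return path.endswith(('.jpg', '.jpeg'))


def absolute_url(href, base=SITE):
    """Resolve href against base; already-absolute links come back unchanged."""
    return urljoin(base, href)


print(absolute_url('/images/fig1.jpg'))
print(is_jpg_link('/images/fig1.JPG?size=large'))
print(is_jpg_link('/articles/jpg-compression.html'))
```

In the recipe, `is_jpg_link` could be passed as the `href` filter in place of the lambda, and `absolute_url(figure['href'], self.site)` could replace the `startswith('/')` branch. I haven't tried this inside a real recipe, so treat it as a starting point.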