Quote:
Originally Posted by kovidgoyal
That should work just fine. The news download system looks for images to download after preprocess has run. Look in the log to see why the images are not downloading. Also, rather than using replaceWith, just set
img.name = 'img'
img['src'] = 'whatever'
The code above fetches all the images, but they all end up inline with the text. I would like to put a line break between each image and the text before and after it. I tried a couple of techniques, including Tag(soup,'br /') and tag.insert, but all of them ended up eliminating the image altogether in the final file.
I also attached the example epub that shows the behavior I am referring to.
Spoiler:
PHP Code:
def preprocess_html(self, soup):
    # Includes all the figures inside the final ebook
    # Finds all the jpg links
    for figure in soup.findAll('a', attrs={'href': lambda x: x and 'jpg' in x}):
        # makes sure that the link points to the absolute web address
        if figure['href'].startswith('/'):
            figure['href'] = self.site + figure['href']
        figure.name = 'img'  # converts the links to img
        figure['src'] = figure['href']  # with the same address as href
        del figure['href']
        del figure['target']
    return soup
Any idea?
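A side note on the absolute-address step in the code above: it could probably also be written with urljoin from the standard library. This is only a rough sketch and not part of the recipe. SITE is a made-up address standing in for self.site, absolute_url is a hypothetical helper name, and the import assumes Python 3.

```python
from urllib.parse import urljoin

# Hypothetical base address standing in for self.site in the recipe.
SITE = 'http://www.example.com'


def absolute_url(href, base=SITE):
    """Resolve href against base; hrefs that are already absolute are kept."""
    # The trailing slash makes relative paths resolve from the site root.
    return urljoin(base + '/', href)


if __name__ == '__main__':
    for href in ('/images/fig1.jpg', 'images/fig2.jpg',
                 'http://other.example.org/fig3.jpg'):
        print(absolute_url(href))
```

Unlike the startswith('/') check, this would also handle links that begin with '//' or that have no leading slash at all.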