#1 |
Wizard
Posts: 1,050
Karma: 11391181
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3
|
No more images downloaded
I have a recipe for the German journal "Tagespost"; here it is:
Code:

    #!/usr/bin/env python
    # vim:fileencoding=utf-8
    from __future__ import unicode_literals, division, absolute_import, print_function
    __license__ = 'GPL v3'
    __copyright__ = '2020, Pat Stapleton <pat.stapleton at gmail.com>'
    '''
    Recipe for Die Tagespost
    '''
    from calibre.web.feeds.news import BasicNewsRecipe


    class AdvancedUserRecipe1589629735(BasicNewsRecipe):
        title = 'Tagespost'
        language = 'de'
        __author__ = 'Pat Stapleton'
        description = ('Die Tagespost trägt den Untertitel Wochenzeitung für Politik, Gesellschaft'
                       ' und Kultur und ist eine überregionale, wöchentlich im Johann Wilhelm Naumann Verlag in Würzburg erscheinende Zeitung.')
        oldest_article = 7
        max_articles_per_feed = 100
        auto_cleanup = True
        use_embedded_content = False
        feeds = [
            ('Tagespost', 'https://www.die-tagespost.de/storage/rss/rss/die-tagespost-komplett.xml'),
        ]
        extra_css = 'td.textb {font-size: medium;} * { text-align: justify !important; text-decoration: none !important}'
        remove_attributes = ['href']
        calibre_most_common_ua = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.87 Safari/537.36'

I'm quite satisfied with the recipe, so my remark is more a question out of interest than a request for support: are there technical reasons for this behaviour (images are no longer downloaded), or am I doing something wrong?
I admit that over time the content of the journal has grown considerably: from about 100 pages (on my e-reader) to 200 and more.
#2 |
Fanatic
Posts: 598
Karma: 85520
Join Date: May 2021
Device: kindle
|
The cause is auto_cleanup = True in the recipe:
https://manual.calibre-ebook.com/new...Recipe.cleanup
To fix it, you can use keep_only_tags and remove_tags: keep only the HTML tags that you think contain the text and images, and remove the unnecessary ones.
https://manual.calibre-ebook.com/new...keep_only_tags
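As a rough sketch of what keep_only_tags and remove_tags look like in a recipe: both take a list of dictionaries of the form dict(name=..., attrs={...}). The class name, feed URL, and tag names and classes used here (`article-body`, `aside`, `related-links`) are placeholders made up for illustration, not taken from the Tagespost site; the real values have to be read from the HTML of an article page. The try/except is only there so the snippet also runs outside calibre.

```python
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Not running inside calibre: use an empty stand-in so the sketch still runs.
    class BasicNewsRecipe(object):
        pass


class ExampleRecipe(BasicNewsRecipe):  # hypothetical class name
    title = 'Example'
    oldest_article = 7
    max_articles_per_feed = 100
    use_embedded_content = False
    # auto_cleanup is left out (it defaults to False), so the automatic
    # text extraction is off and the lists below decide what is kept.

    # Keep only these elements and their children (placeholder selector).
    keep_only_tags = [
        dict(name='div', attrs={'class': 'article-body'}),
    ]

    # Remove unwanted elements (placeholder selectors).
    remove_tags = [
        dict(name='aside'),
        dict(name='div', attrs={'class': 'related-links'}),
    ]

    feeds = [
        ('Example feed', 'https://example.com/rss.xml'),  # placeholder URL
    ]
```

The idea is that images inside the kept element survive, whereas the automatic cleanup may discard them along with the page clutter.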
#3 |
Wizard
Posts: 1,050
Karma: 11391181
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3
|
@unkn0wn: Thank you so far! auto_cleanup = True is already in the recipe. Do you mean that I have to change that?
Sorry, I'm not technical at all!
#4 | |
Fanatic
Posts: 598
Karma: 85520
Join Date: May 2021
Device: kindle
|
Quote:
I thought you didn't want support. So I pointed you towards documentation. |
|
#5 |
Wizard
Posts: 1,050
Karma: 11391181
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3
|
Oh, I would very much like to have my images back!
So, as I understand it, I should replace:
Code:

    auto_cleanup = True

with:
Code:

    keep_only_tags

Edit: I tried, but I get an error message: "keep_only_tags should be defined". Would it be correct to write:
Code:

    keep_only_tags = True

Sorry again for my ignorance!
Last edited by Leonatus; 03-04-2023 at 04:06 AM.
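On the `keep_only_tags = True` question: as far as the calibre documentation describes it, auto_cleanup is an on/off switch, while keep_only_tags is a list of tag descriptions, so True is not a meaningful value for it. A small illustration of the difference follows; `is_tag_list` is a helper invented here, not part of calibre, and the `div`/`content` selector is a placeholder.

```python
def is_tag_list(value):
    """Return True if value has the shape expected for keep_only_tags or
    remove_tags: a list of dicts, each with a 'name' and/or 'attrs' key."""
    if not isinstance(value, list):
        return False
    return all(
        isinstance(item, dict) and ('name' in item or 'attrs' in item)
        for item in value
    )


wrong = True  # a boolean, as in "keep_only_tags = True"
right = [dict(name='div', attrs={'class': 'content'})]  # placeholder selector

print(is_tag_list(wrong))  # prints False
print(is_tag_list(right))  # prints True
```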
#6 |
Fanatic
Posts: 598
Karma: 85520
Join Date: May 2021
Device: kindle
|
Code:

    '''
    Recipe for Die Tagespost
    '''
    from calibre.web.feeds.news import BasicNewsRecipe


    class tagespost(BasicNewsRecipe):
        title = 'Tagespost'
        language = 'de'
        __author__ = 'unkn0wn'
        description = ('Die Tagespost trägt den Untertitel Wochenzeitung für Politik, Gesellschaft'
                       ' und Kultur und ist eine überregionale, wöchentlich im Johann Wilhelm Naumann Verlag in Würzburg erscheinende Zeitung.')
        oldest_article = 7
        max_articles_per_feed = 100
        use_embedded_content = False

        keep_only_tags = [
            dict(name='article', attrs={'class':'art-detail'})
        ]

        feeds = [
            ('Tagespost', 'https://www.die-tagespost.de/storage/rss/rss/die-tagespost-komplett.xml'),
        ]
#7 |
Wizard
Posts: 1,050
Karma: 11391181
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3
|
Wonderful! Not only do the images appear again, but the entire layout looks nicer, too!
So great! Thank you!