Hi Kunvp,
Looking at my changes to gva_be.recipe will probably not help you much in understanding how to work on other recipes. I removed some obsolete code, which makes the change look bigger than it actually was.
As far as I can see, all of the Belgian Dutch news sources have a valid table of contents. This means the feed addresses are still correct, but there's something wrong with the extraction of the content. Modifying the keep_only_tags and remove_tags sections should be sufficient in this case.
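To give a rough idea of how the two settings sit together in a recipe, here is a minimal sketch. The class names 'article-body' and 'ad-banner' and the recipe name are made up for illustration and not taken from any real site, and the try/except is only there so the snippet can also be loaded outside Calibre. As far as I know, Calibre applies keep_only_tags first and remove_tags afterwards.

```python
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Stand-in so the snippet can be loaded outside Calibre.
    class BasicNewsRecipe(object):
        pass


class ExampleRecipe(BasicNewsRecipe):
    # Hypothetical recipe; a real one also needs feeds (or parse_index).
    title = 'Example'

    # Keep only the element(s) that hold the article text
    # ('article-body' is an assumed class name).
    keep_only_tags = [dict(name='div', attrs={'class': 'article-body'})]

    # Drop unwanted parts inside what was kept
    # ('ad-banner' is an assumed class name).
    remove_tags = [dict(name='div', attrs={'class': 'ad-banner'})]
```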
For example, if you look at demorgen_be.recipe, you will find the line:
Code:
keep_only_tags = [dict(name='div' , attrs={'class':'art_box2'})]
which means that Calibre expects the content to be wrapped in an HTML tag like <div class="art_box2">...</div>. But if you look at the source code of an arbitrary article (picture attached), you will see that the relevant tag is now <div class="article__wrapper">...</div>. By changing the line above to:
Code:
keep_only_tags = [dict(name='div' , attrs={'class':'article__wrapper'})]
you should get a working recipe (I haven't tried it myself).
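If you don't want to dig through the page source by hand, a small script can list the class names used on div tags, which makes it easier to spot the wrapper. This is just a sketch using Python's standard html.parser, not anything Calibre provides. The names DivClassCounter and div_classes are my own, and the sample HTML (including the 'ad' class) is invented; in practice you would feed it the saved source of a real article.

```python
from collections import Counter
from html.parser import HTMLParser


class DivClassCounter(HTMLParser):
    """Count the class names used on <div> tags in a page."""

    def __init__(self):
        super().__init__()
        self.classes = Counter()

    def handle_starttag(self, tag, attrs):
        if tag != 'div':
            return
        for name, value in attrs:
            # A class attribute may hold several space-separated names.
            if name == 'class' and value:
                self.classes.update(value.split())


def div_classes(html):
    """Return a Counter mapping div class names to how often they occur."""
    parser = DivClassCounter()
    parser.feed(html)
    return parser.classes


# Invented sample; replace with the source of a real article page.
sample = '<div class="article__wrapper"><div class="ad">x</div><p>Text</p></div>'
print(div_classes(sample))
```

A class that appears once and wraps the whole article text is usually a good candidate for keep_only_tags.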
For an in-depth explanation of recipe programming just have a look at the Calibre documentation:
https://manual.calibre-ebook.com/news.html