Quote:
Originally Posted by aoitenshi
Hello, I posted a request but I decided to try my hand at it by using the Basic editor for custom news source on Calibre. I followed the guide on the website but I obviously don't understand enough code to get very far......
The website doesn't provide full RSS feeds so I try to load up the print version of linked articles. What I don't understand is that I seem to be getting the header / footer but I can't see the article itself. It's a free newspaper so all the content should load.
Why is this so?
I became interested in this, so I decided to rewrite my comments as a new post. You can mostly ignore the one above as it relates to getting through the RSS feed, and that is not needed.
Your RSS feeds do have cookies, etc., but AFAICT, they simply send you to the same place every time. So the .../RSS/HotNews RSS feed just sends you to the .../Hotnews page. Modifying your feed addresses by removing "/RSS" will give you everything that the RSS feed does with a lot less trouble.
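If you have several feed addresses to convert, something like the following might save some typing. It is only a sketch: the host name is made up (the real addresses aren't shown here), `strip_rss_segment` is my own helper name, and it assumes "RSS" always appears as its own path segment.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_rss_segment(feed_url):
    """Return feed_url with any 'RSS' path segment removed."""
    parts = urlsplit(feed_url)
    # Drop only segments that are exactly "RSS"; leave everything else alone.
    segments = [s for s in parts.path.split("/") if s != "RSS"]
    return urlunsplit(parts._replace(path="/".join(segments)))


# Hypothetical address, used for illustration only.
print(strip_rss_segment("http://news.example.com/RSS/HotNews"))
```

Check the result in a browser before putting it in the recipe, since the site may not treat upper and lower case in the path the same way.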
The destination page for each of your feeds has an article teaser, with a "Read More" link inside an <a> tag having id=moreLink.
I'd approach this as follows:
Use the parse_index method described here:
Then use soup (as described there) to grab the moreLink as your article for that feed.
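To make the idea concrete, here is a rough standalone sketch of the logic, not a finished recipe. It uses only the Python standard library so it can be run outside calibre; `MoreLinkFinder`, `build_index`, and the example.com address are names I made up for illustration. Inside an actual recipe you would more likely fetch each page with `self.index_to_soup(url)` and look for the link with something like `soup.find('a', attrs={'id': 'moreLink'})`. The return value mimics what parse_index is expected to hand back: a list of (feed title, list of article dicts) tuples.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class MoreLinkFinder(HTMLParser):
    """Remember the href of the first <a> tag whose id is 'moreLink'."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("id") == "moreLink" and self.href is None:
            self.href = attrs.get("href")


def build_index(pages):
    """Build a parse_index-style result.

    pages is a list of (feed_title, page_url, page_html) tuples, one per
    destination page. Each feed gets at most one article: the page its
    "Read More" link points to.
    """
    index = []
    for feed_title, page_url, page_html in pages:
        finder = MoreLinkFinder()
        finder.feed(page_html)
        articles = []
        if finder.href:
            articles.append({
                "title": feed_title,  # assumption: reuse the feed title
                "url": urljoin(page_url, finder.href),  # resolve relative links
                "description": "",
                "date": "",
            })
        index.append((feed_title, articles))
    return index


# Hypothetical teaser page, for illustration only.
sample = '<div><p>Teaser</p><a id="moreLink" href="/Hotnews/story1">Read More</a></div>'
print(build_index([("Hot News", "http://news.example.com/Hotnews", sample)]))
```

In the recipe itself, the body of `build_index` would go inside your `parse_index` method, looping over your own list of feed titles and page addresses.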