#1 |
|
Member
Posts: 14
Karma: 132
Join Date: Aug 2014
Device: Kindle Paperwhite 7th Gen
|
Creating a recipe for Scroll.in
I created a recipe for scroll.in.
It works, in the sense that it produces a proper magazine that is readable on Kindle. However, the recipe lists 20 sections, and 10 of them fail with a permanent redirect error. The error message is:
Code:
URL: https://scroll.in/food
<httperror_seek_wrapper (urllib.error.HTTPError instance) at 0x6208928 whose wrapped object = <HTTPError 308: 'PERMANENT REDIRECT'>>
Here's the full code:
Code:
#!/usr/bin/env python
# vim:fileencoding=utf-8
# calibre's bundled BeautifulSoup wrapper, needed by parse_index() below
from calibre.ebooks.BeautifulSoup import BeautifulSoup
from calibre.web.feeds.news import BasicNewsRecipe


class AdvancedUserRecipe1602773140(BasicNewsRecipe):
    title = 'Scroll.in'
    oldest_article = 2
    max_articles_per_feed = 20
    auto_cleanup = True
    compress_news_images = True
    compress_news_images_auto_size = 24

    def parse_index(self):
        # (section name, section index URL) pairs
        category_list = [
            ('coronavirus-crisis', 'https://scroll.in/topic/56256/coronavirus-crisis'),
            ('food', 'https://scroll.in/food'),
            ('latest', 'https://scroll.in/latest'),
            ('reel', 'https://scroll.in/reel'),
            ('field', 'https://scroll.in/field'),
            ('magazine', 'https://scroll.in/magazine'),
            ('politics', 'https://scroll.in/category/76/politics'),
            ('culture', 'https://scroll.in/category/107/culture'),
            ('india', 'https://scroll.in/category/105/india'),
            ('world', 'https://scroll.in/category/3554/world'),
            ('film-and-tv', 'https://scroll.in/category/3/film-and-tv'),
            ('music', 'https://scroll.in/category/4/music'),
            ('books-and-ideas', 'https://scroll.in/category/80/books-and-ideas'),
            ('business-and-economy', 'https://scroll.in/category/77/business-and-economy'),
            ('science-and-technology', 'https://scroll.in/category/83/science-and-technology'),
            ('roving', 'https://scroll.in/roving'),
            ('global', 'https://scroll.in/global'),
            ('announcements', 'https://scroll.in/announcements'),
            ('pulse', 'https://scroll.in/pulse'),
            ('theplus', 'https://scroll.in/theplus')
        ]
        br = self.get_browser()
        feeds = []
        for category in category_list:
            print('URL: ', category[1])
            try:
                page = br.open(category[1])
                html = page.read()
            except Exception as e:
                # Log the failure and skip this section
                print(repr(e))
                continue
            soup = BeautifulSoup(html)
            stories = soup.find_all('div', class_='row-story-meta')
            articles = []
            for story in stories:
                # The enclosing element carries the article link
                article = story.find_parent()
                author = story.find('address')
                author = author.text if author is not None else 'Scroll.in'
                article_dict = {'url': article['href'],
                                'title': story.find('h1').text,
                                'date': story.find('time')['datetime'],
                                'author': author}
                articles.append(article_dict)
            feeds.append((category[0], articles))
        return feeds
#2 |
|
creator of calibre
Posts: 45,606
Karma: 28548974
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
It's a redirect loop; the server is buggy. Add a trailing slash to the URL and you will be fine.
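As a minimal sketch of that fix, the section URLs could be normalized before they are opened. The helper name `with_trailing_slash` is hypothetical (it is not part of calibre), the two entries are taken from the recipe's `category_list`, and the helper assumes the URLs carry no query string or fragment:

```python
def with_trailing_slash(url):
    """Return url unchanged if it already ends with '/', else append one."""
    return url if url.endswith('/') else url + '/'


# Two of the section entries from the recipe, used here for illustration.
category_list = [
    ('food', 'https://scroll.in/food'),
    ('latest', 'https://scroll.in/latest'),
]

# Rebuild the list with normalized URLs; the section names are untouched.
fixed = [(name, with_trailing_slash(url)) for name, url in category_list]

for name, url in fixed:
    print(name, url)
```

Editing the URLs by hand in `category_list` works just as well; the helper only guards against forgetting one.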
#3 |
|
Member
Posts: 14
Karma: 132
Join Date: Aug 2014
Device: Kindle Paperwhite 7th Gen
|
Thanks, that worked like a charm.
Can I submit this recipe for inclusion in Calibre?
#4 |
|
creator of calibre
Posts: 45,606
Karma: 28548974
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Yes, of course. Preferably via a pull request, or just attach it here.
#5 |
|
Member
Posts: 14
Karma: 132
Join Date: Aug 2014
Device: Kindle Paperwhite 7th Gen
|
Thanks. I'll try to create a pull request. If my limited git skills fail me, then I'll attach it here.