#1 |
Guru
Posts: 735
Karma: 35936
Join Date: Apr 2011
Location: Shrewsbury, MA
Device: Lenovo Android Tablet
|
The Popular Science recipe is broken.
It downloads only the titles and links, not the article text:
Five rad and random things I found this week
The end-of-week dispatch from PopSci's commerce editor. Vol. 27. By Billy Cadden posted Oct 27th, 2017 at 12:15pm This article was downloaded by calibre from https://www.popsci.com/rad-and-rando...efault&src=syn |
#2 |
Enthusiast
Posts: 36
Karma: 10
Join Date: Dec 2017
Location: Los Angeles, CA
Device: Smart Phone
|
Hello There,
I updated the recipe so that it finds the body of the article again. Apparently, they changed the CSS class name of the div containing the main text. Anyway, here's the recipe:

Popular Science
Code:

    from calibre.web.feeds.news import BasicNewsRecipe
    import re


    class AdvancedUserRecipe1282101454(BasicNewsRecipe):
        title = 'Popular Science'
        language = 'en'
        __author__ = 'Kovid Goyal'
        description = 'Popular Science'
        publisher = 'Popular Science'
        oldest_article = 7  # change this if you want more current articles. I like to go a week in
        max_articles_per_feed = 100
        no_stylesheets = True
        remove_javascript = True
        use_embedded_content = False
        remove_empty_feeds = True
        ignore_duplicate_articles = {'url'}

        feeds = [
            ('Gadgets', 'http://www.popsci.com/full-feed/gadgets'),
            ('Cars', 'http://www.popsci.com/full-feed/cars'),
            ('Science', 'http://www.popsci.com/full-feed/science'),
            ('Technology', 'http://www.popsci.com/full-feed/technology'),
            ('DIY', 'http://www.popsci.com/full-feed/diy'),
            ('Animals', 'https://www.popsci.com/rss-animals.xml'),
            ('Space', 'https://www.popsci.com/rss-space.xml'),
            ('Environment', 'https://www.popsci.com/rss-environment.xml'),
            ('Eastern Arsenal', 'https://www.popsci.com/rss-eastern-arsenal.xml'),
        ]

        # Matches the class of the div that holds the article text.
        # Raw string so that \w is not treated as a string escape.
        pane_node_body = re.compile(r'pane-node-(?:\w+-){0,9}body')

        # Keep the article header and body; everything else is dropped.
        keep_only_tags = [
            dict(attrs={'class': lambda x: x and frozenset('pane-node-header'.split()).issubset(frozenset(x.split()))}),
            dict(attrs={'class': pane_node_body}),
        ]

        remove_tags = [
            dict(attrs={'class': lambda x: x and frozenset('ads seperator'.split()).issubset(frozenset(x.split()))}),
        ]

        def preprocess_html(self, soup):
            # Lazy-loaded images keep their real URL in data-medsrc; copy it to src.
            for img in soup.findAll('img', attrs={'data-medsrc': True}):
                img['src'] = img['data-medsrc']
            return soup

Last edited by lui1; 12-28-2017 at 04:52 AM. |
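For anyone who wants to check the class matching without running calibre, here is a minimal standalone sketch. It reuses the recipe's regular expression and the same frozenset subset test as the lambdas. The class strings in `samples` are made up for illustration and are not copied from the PopSci pages, `has_classes` is a hypothetical helper that is not part of the recipe, and the sketch assumes the pattern is applied to the class attribute with search semantics.

```python
import re

# Same pattern as in the recipe, written as a raw string.
pane_node_body = re.compile(r'pane-node-(?:\w+-){0,9}body')


def has_classes(required):
    """Hypothetical helper mirroring the recipe's lambdas: true when every
    class named in `required` appears in the element's class attribute."""
    wanted = frozenset(required.split())
    return lambda value: bool(value) and wanted.issubset(frozenset(value.split()))


# Made-up class attribute values, for illustration only.
samples = [
    'pane-node-body',
    'panel-pane pane-node-field-article-body',
    'pane-node-header',
    'sidebar ads',
]

body_hits = [s for s in samples if pane_node_body.search(s)]
header_hits = [s for s in samples if has_classes('pane-node-header')(s)]
ads_hits = [s for s in samples if has_classes('ads seperator')(s)]

print(body_hits)
print(header_hits)
print(ads_hits)
```

The `{0,9}` part lets the pattern tolerate up to nine extra hyphenated words between `pane-node-` and `body`, which is presumably why a renamed body class still matches. The subset test only succeeds when all listed classes are present, so an element with `ads` but without `seperator` is not removed.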
#3 |
Guru
Posts: 735
Karma: 35936
Join Date: Apr 2011
Location: Shrewsbury, MA
Device: Lenovo Android Tablet
|
Thanks! Looks quite good now!
Happy new year. |