07-28-2017, 11:53 AM | #1 |
Connoisseur
Posts: 82
Karma: 10
Join Date: Dec 2015
Device: Kindle
|
Washington Examiner byline - photos
Is it possible to include the byline (author and date) and photos in this recipe? I put it together from other scripts; it is not in the standard recipe list. It works well for the article heading and body as-is. Thanks.
------------------------------------------------------------------------
#!/usr/bin/env python2
# vim:fileencoding=utf-8
# License: GPLv3 Copyright: 2016, Kovid Goyal <kovid at kovidgoyal.net>

from __future__ import (unicode_literals, division, absolute_import,
                        print_function)

from calibre.web.feeds.news import BasicNewsRecipe


def classes(classes):
    # Build a matcher for tags carrying any of the space-separated class names
    q = frozenset(classes.split(' '))
    return dict(attrs={
        'class': lambda x: x and frozenset(x.split()).intersection(q)})


class WashingtonExaminer(BasicNewsRecipe):
    title = u'Washington Examiner'
    oldest_article = 2
    language = 'en'
    remove_empty_feeds = True
    extra_css = """
    body{font-family: Arial,sans-serif }
    .caption{font-size: x-small}
    .author,.datePub{font-size: small}
    """
    __author__ = 'Kovid Goyal'
    simultaneous_downloads = 4
    max_articles_per_feed = 20
    use_embedded_content = False
    compress_news_images = True
    compress_news_images_auto_size = 8
    no_stylesheets = True
    auto_cleanup = True
    ignore_duplicate_articles = {'title', 'url'}

    feeds = [
        ('News', 'http://www.washingtonexaminer.com/rss/news'),
        ('Politics', 'http://www.washingtonexaminer.com/rss/politics'),
        ('Editorial', 'http://www.washingtonexaminer.com/rss/editorials'),
        ('Policy', 'http://washingtonexaminer.com/rss/policy'),
        ('Opinion', 'http://www.washingtonexaminer.com/rss/opinion'),
        ('Columnists', 'http://www.washingtonexaminer.com/rss/columnists'),
        ('Magazine', 'http://www.washingtonexaminer.com/rss/magazine'),
    ]

    # copied in, not working to present images
    def preprocess_html(self, soup):
        # Copy lazy-load data-src into src so the image URL is visible
        for img in soup.findAll(attrs={'data-src': True}):
            img['src'] = img['data-src']
        # Keep only the first <h1>; drop any later ones
        all_h1s = soup.findAll('h1')
        for h1 in all_h1s[1:]:
            h1.extract()
        return soup
|
07-29-2017, 03:41 AM | #2 |
creator of calibre
Posts: 43,858
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
You will need to remove auto_cleanup = True and use keep_tags/remove_tags instead.
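A minimal sketch of what that change could look like, reusing the classes() helper from the recipe above. The class names 'article-body', 'social-share' and 'related-links' are hypothetical placeholders, and 'author', 'datePub' and 'caption' are only taken from the recipe's extra_css, so all of them would need to be checked against the actual page source. The try/except is only there so the sketch can be loaded outside calibre.

```python
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:  # not running inside calibre; stub so the sketch loads
    BasicNewsRecipe = object


def classes(names):
    # Matcher for tags carrying any of the space-separated class names
    q = frozenset(names.split(' '))
    return dict(attrs={
        'class': lambda x: x and frozenset(x.split()).intersection(q)})


class WashingtonExaminerSketch(BasicNewsRecipe):
    title = 'Washington Examiner'
    language = 'en'
    no_stylesheets = True
    # auto_cleanup is deliberately not set: it would be replaced by the
    # explicit keep_tags/remove_tags lists below.

    # Keep the headline, byline, date, captions and body.
    # 'article-body' is a hypothetical class name.
    keep_tags = [
        dict(name='h1'),
        classes('author datePub caption article-body'),
    ]

    # Drop scripts and page furniture.
    # 'social-share' and 'related-links' are hypothetical class names.
    remove_tags = [
        dict(name=['script', 'style', 'aside']),
        classes('social-share related-links'),
    ]

    feeds = [
        ('News', 'http://www.washingtonexaminer.com/rss/news'),
    ]
```

With keep_tags in place, only the matched elements survive, so the byline and image containers have to be listed explicitly or they will be dropped along with everything else.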
|
08-28-2017, 09:09 PM | #3 |
Connoisseur
Posts: 82
Karma: 10
Join Date: Dec 2015
Device: Kindle
|
Washington Examiner article body font size
Kovid,
I redid the recipe with keep_tags/remove_tags. I now get the articles with the byline author and date consistently, and I believe some of the pictures. However, I cannot get the article body down to a normal (smaller) font size with extra_css. The extra_css removes the bolding of the article body text, but the font size does not change. Could you advise on that? Thanks in advance.
URL for the new source RSS feeds - http://www.washingtonexaminer.com/rs...n=%2Fnation%2F |
08-28-2017, 10:39 PM | #4 |
creator of calibre
Posts: 43,858
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Add an !important to the css rule to make sure it overrides anything in the input document.
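As an illustration only, here are the rules from the first post's extra_css with !important appended to each declaration. In the recipe this string would be the extra_css class attribute; whether these selectors match the site's markup is an assumption.

```python
# Same selectors as the original recipe, each declaration marked !important
extra_css = """
    body { font-family: Arial, sans-serif !important }
    .caption { font-size: x-small !important }
    .author, .datePub { font-size: small !important }
"""
```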
|
08-29-2017, 11:10 AM | #5 |
Connoisseur
Posts: 82
Karma: 10
Join Date: Dec 2015
Device: Kindle
|
I tried changing the rule to the following, but the article body text still does not get any smaller.
extra_css = 'body{font-size: x-small !important; font-weight: normal }' |
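One possible explanation, offered as an assumption because the page markup is not shown in the thread: in CSS, !important on body only strengthens the declaration on body itself. A child element that has its own font-size declared keeps it, since an inherited value never overrides a declared one. A sketch that applies the rule to the text-carrying elements directly (the selector list is a guess, not taken from the site):

```python
# Hypothetical list of elements that carry the article text
BODY_SELECTORS = ['body', 'p', 'div', 'span', 'li']

# Apply the size and weight to each element rather than relying on
# inheritance from body
extra_css = (
    '{} {{ font-size: x-small !important; '
    'font-weight: normal !important }}'
).format(', '.join(BODY_SELECTORS))
```

Because x-small is an absolute size keyword, nested elements do not shrink further at each level. The trade-off is that spans or divs inside headings and bylines are resized too, so narrower selectors would be preferable once the real class names are known.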
|