Old 08-28-2017, 09:09 PM   #3
jma1
Washington Examiner article body font size

Kovid,
I redid the new recipe using keep_only_tags/remove_tags. I now get the articles consistently with the author byline and date, and, I believe, some of the pictures.
However, I could not get the article body down to a normal (smaller) font size with extra_css. The extra_css rule removes the bolding of the article body text, but the font-size setting has no effect. Could you advise on that? Thanks in advance.
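For reference, here is a minimal sketch of one variation that might be worth trying: applying the size to the paragraphs inside the article body instead of to body. The selectors are assumptions based on the keep_only_tags entries in the recipe (the article-body class and the articleBody itemprop); I have not confirmed that the site's markup matches them or that this changes the result.

```python
# Sketch only: the selectors are assumed from keep_only_tags in the recipe,
# not confirmed against the site's actual markup.
extra_css = '''
.article-body p, [itemprop="articleBody"] p {
    font-size: 0.8em;
    font-weight: normal;
}
'''

# Show the rule that would be assigned to the recipe's extra_css attribute.
print(extra_css.strip())
```

In the recipe this string would replace the current body{...} rule on the extra_css line.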

Current recipe

Spoiler:
#!/usr/bin/env python2
# vim:fileencoding=utf-8
# License: GPLv3 Copyright: 2016, Kovid Goyal <kovid at kovidgoyal.net>

from __future__ import (unicode_literals, division, absolute_import, print_function)
from calibre.web.feeds.news import BasicNewsRecipe


def classes(classes):
    # Build a matcher for tags whose class attribute contains
    # at least one of the space-separated names in `classes`.
    q = frozenset(classes.split(' '))
    return dict(attrs={
        'class': lambda x: x and frozenset(x.split()).intersection(q)})


class WashingtonExaminer(BasicNewsRecipe):
    title = u'Washington Examiner'
    __author__ = 'Kovid Goyal'
    oldest_article = 2
    max_articles_per_feed = 10
    use_embedded_content = False
    compress_news_images = True
    compress_news_images_auto_size = 8
    no_stylesheets = True
    encoding = 'utf8'

    language = 'en'
    remove_empty_feeds = True

    extra_css = 'body{font-size: 0.8em; font-weight: normal }'
    # extra_css = '.author{font-weight: normal; font-size: x-small}'
    # extra_css = '.caption{font-size: x-small}'

    ignore_duplicate_articles = {'title', 'url'}
    keep_only_tags = [
        dict(itemprop=['headline', 'author', 'datePublished', 'articleBody']),
        dict(name='h1'),
        classes('article-body featured-image'),
    ]

    feeds = [
        ('News', 'http://www.washingtonexaminer.com/rss/news'),
    ]

    def preprocess_html(self, soup):
        # Copy each data-src value into src so the images get downloaded.
        for img in soup.findAll(attrs={'data-src': True}):
            img['src'] = img['data-src']
        # Keep only the first h1 and drop any later ones.
        all_h1s = soup.findAll('h1')
        for h1 in all_h1s[1:]:
            h1.extract()
        return soup
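
As a quick check of what the classes() helper in the recipe matches, here is a small standalone sketch that runs without calibre. The sample class strings ('article-body wide', 'sidebar promo') are made up for illustration and are not taken from the site.

```python
def classes(classes):
    # Same helper as in the recipe: matches any tag whose class attribute
    # shares at least one name with the given space-separated list.
    q = frozenset(classes.split(' '))
    return dict(attrs={
        'class': lambda x: x and frozenset(x.split()).intersection(q)})


# Pull out the matching function that would be applied to each tag's class value.
matcher = classes('article-body featured-image')['attrs']['class']

# Hypothetical class attribute values, for illustration only.
print(bool(matcher('article-body wide')))  # True
print(bool(matcher('sidebar promo')))      # False
print(bool(matcher(None)))                 # False
```

So a tag is kept when any one of its classes is in the list, and tags with no class attribute are skipped.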


URL for the new source RSS feeds:

http://www.washingtonexaminer.com/rs...n=%2Fnation%2F