Quote:
Originally Posted by clintiepoo
I got the double-title to go away with this code.
Code:
remove_tags = [
    dict(name='a')
]
I'm still not sure how to get the date and picture to show up on different lines. Anybody?
Eventually, I'd also like to give the headline and the date their own fonts.
Guys,
please help. How do I put line breaks between the different tags I'm keeping? Right now everything runs together on one big line. This can't be that hard.
How it is:
dateIMAGEcaption
I want:
date
IMAGE
caption
Code:
#!/usr/bin/env python
'''
http://www.herald-review.com
'''
from calibre.web.feeds.news import BasicNewsRecipe


class DecaturHerald(BasicNewsRecipe):
    title = u'Herald and Review'
    __author__ = u'Clint'
    description = u"Decatur, IL Newspaper"
    oldest_article = 7
    language = 'en'
    max_articles_per_feed = 100
    no_stylesheets = True
    use_embedded_content = False
    extra_css = '''
        h1 {text-align:left;}
        .updated {font-family:monospace;text-align:left;margin-bottom: 1em;}
        .img {text-align:center;}
        .gallery-cutline {text-align:center;font-size:smaller;font-style:italic}
        .credit {text-align:right;margin-bottom:0em;font-size:smaller;}
        .div {text-align:left;}
    '''
    cover_url = 'http://www.herald-review.com/content/tncms/live/global/resources/images/hr_logo.jpg'
    keep_only_tags = [
        dict(name='h1'),
        dict(name='span', attrs={'class':'updated'}),
        dict(name='img', attrs={'id':'img-holder'}),
        dict(name='span', attrs={'id':'gallery-cutline'}),
        dict(name='div', attrs={'id':'blox-story-text'})
    ]
    remove_tags = [
        dict(name='a')
    ]
    feeds = [
        (u'Local News', u'http://www.herald-review.com/search/?f=rss&c[]=news/local&sd=desc&s=start_time'),
        # (u'Breaking News', u'http://www.herald-review.com/search/?f=rss&k[]=%23breaking&sd=desc&s=start_time'),
        # (u'State and Regional ', u'http://www.herald-review.com/search/?f=rss&c[]=news/state-and-regional&sd=desc&s=start_time'),
        # (u'Crime and courts', u'http://www.herald-review.com/search/?f=rss&c[]=news/local/crime-and-courts&sd=desc&s=start_time'),
        # (u'Local Business ', u'http://www.herald-review.com/search/?f=rss&c[]=business/local&sd=desc&s=start_time'),
        # (u'Editorials', u'http://www.herald-review.com/search/?f=rss&c[]=news/opinion/editorial&sd=desc&s=start_time'),
        # (u'Illini News', u'http://www.herald-review.com/search/?f=rss&q=illini&sd=desc&s=start_time')
    ]
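One thing that might be worth trying, untested against the live site: span and img are inline elements by default, so the date, picture and caption kept by keep_only_tags flow onto one line. Asking for display:block in extra_css may be enough to put each on its own line. The sketch below only builds the CSS text. The helper name block_css and the three selectors are my own assumptions, guessed from the keep_only_tags entries in the recipe above, and they are not checked against the actual page markup.

```python
# Sketch only: these selectors are guesses taken from keep_only_tags in the
# recipe above; adjust them to whatever the real page uses.
BLOCK_SELECTORS = ['.updated', 'img', '#gallery-cutline']


def block_css(selectors):
    """Return one CSS rule per selector that forces it onto its own line.

    block_css is a hypothetical helper, not part of calibre.
    """
    return '\n'.join('%s {display:block;}' % s for s in selectors)


# Text that could be appended to the recipe's existing extra_css string.
extra_block_css = block_css(BLOCK_SELECTORS)
print(extra_block_css)
```

If this works, the same three rules can simply be typed into extra_css by hand. The helper is only there to keep the selector list in one place.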